Method of designing molecular structure of enzyme inhibitor

ABSTRACT

A method of designing a molecular structure of an inhibitor of a target enzyme by analyzing, in the tertiary structure of a first enzyme, a geometric configuration formed by amino acids to which an inhibitor on the first enzyme is to bind. The method comprises (1) a step of determining the amino acid group of the first enzyme to which the inhibitor is to bind, (2) a step of determining the geometric configuration of the amino acid group of the step (1), (3) a step of searching, in the target enzyme, for an amino acid group having a geometric configuration similar to that of the step (2), and (4) a step of determining whether or not (i) the amino acids constituting the amino acid group are present at the surface of the target enzyme, and (ii) the amino acids have been conserved among species, by comparison with an enzyme derived from an organism of a different type from the organism from which the target enzyme is derived. When these requirements are satisfied, it is assumed that the geometric configuration formed by the amino acids to which the inhibitor of the first enzyme is to bind is also present in the target enzyme, and the molecular structure of an inhibitor of the target enzyme is then designed.

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This is a Continuation Application of PCT Application No. PCT/JP00/08467, filed Nov. 30, 2000, which was not published under PCT Article 21(2) in English.

BACKGROUND OF THE INVENTION

[0002] 1. Field of the Invention

[0003] The present invention relates to a method of designing an enzyme inhibitor.

[0004] More specifically, the present invention relates to a method of designing a molecular structure of an inhibitor on a target enzyme, the secondary and tertiary structures of which being known, comprising: analyzing, in the tertiary structure of a first enzyme whose primary, secondary and tertiary structures are known, a geometric configuration formed by amino acids to which a first inhibitor, having biochemical, inhibitory activity on the first enzyme, is to bind; and searching, in the target enzyme, the similar geometric configuration to the geometric configuration formed in the first enzyme, thereby designing a molecular structure of an inhibitor on the target enzyme.

[0005] Further, the present invention relates to another method of designing a molecular structure of an inhibitor on a target enzyme whose tertiary structure is unknown, comprising: analyzing, in the tertiary structure of a first enzyme whose primary, secondary and tertiary structures are known, a geometric configuration formed by amino acids to which a first inhibitor having biochemical, inhibitory activity on the first enzyme, is to bind; and searching, in a second enzyme whose primary, secondary and tertiary structures are known, the similar geometric configuration to the geometric configuration formed in the first enzyme, thereby designing the molecular structure of the inhibitor on the target enzyme.

[0006] 2. Description of the Related Art

[0007] DNA is a substance which carries genetic information. It is well known that “DNA replication”, a reaction in which DNA is doubled, “DNA repair”, a reaction in which errors and damage in the structure thereof are repaired, and “DNA recombination”, a reaction in which genetic information is exchanged, are the basis of all phenomena of living organisms. The series of DNA synthesis reactions described above is carried out mainly by enzymes called DNA polymerases.

[0008] As DNA polymerase (hereinafter also referred to as “pol.”) of eucaryotes (mainly mammals), 8 types of DNA polymerase have been discovered, which are identified as pol. α, β, γ, δ, ε, ζ, η and θ according to molecular weight, protein-chemical features and differences in DNA synthesis function (1-4). (Note that the numbers in parentheses refer to the references listed in the reference section below.) It is assumed that these enzymes each play a different role. Specifically, pol. α, δ and ε are presumably involved in nuclear DNA replication (1), pol. γ is presumably involved in mitochondrial DNA replication (1), and pol. β, ζ, η and θ are presumably involved in repair of nuclear DNA (1-4). Further, it has been suggested that pol. β is also involved in DNA recombination (1). However, the biochemical roles of DNA polymerases in DNA synthesis reactions are still unclear in many respects.

[0009] Well-known examples of a molecular biological technique, which is often employed in the study of the roles played by enzymes, generally include: a method of producing a mutant having deficiency of the enzyme, and thereby investigating the phenotype; or a method of producing an individual whose gene encoding the enzyme has been knocked-out, thereby investigating the phenotype. However, in the case of DNA polymerase, application of such methods as described above most likely results in the death of the organism. Thus, it is difficult to analyze the roles of the enzyme by using such methods.

[0010] On the other hand, there has been another approach in which the functions of the DNA polymerase are studied on the basis of the analysis of an inhibitor on the enzyme. Conventionally, an inhibitor on an enzyme is searched for among natural substances, chemical files and the like. However, such conventional methods basically discover an inhibitor simply “by chance” after an enormous number of tests are carried out, which is by no means efficient.

[0011] If the structure of the amino acid residues constituting an enzyme can be identified up to the tertiary level thereof, the molecular structure of an inhibitor on the enzyme will more easily be deduced. However, an enzyme generally has a large molecular weight and a relatively complicated three-dimensional structure, and it is thus quite often difficult to analyze the tertiary structure thereof.

[0012] In consideration of the problems described above, an object of the present invention is to provide a method of efficiently discovering an inhibitor on an enzyme without extensive trial and error. In particular, the object of the present invention is to provide a method of efficiently discovering, without extensive trial and error, an inhibitor on an enzyme whose molecular weight is approximately 30,000 or more and whose tertiary structure therefore cannot be identified with NMR.

BRIEF SUMMARY OF THE INVENTION

[0013] As a result of intensive study, the inventors of the present invention have discovered a method of efficiently designing a molecular structure of an inhibitor on an enzyme, and have thereby completed the present invention. The method comprises identifying the amino acid residues of a specific enzyme to which an inhibitor is to bind in an interaction between the enzyme and the inhibitor, identifying the geometric configuration formed by the Cα atoms of these amino acid residues, and determining by search (1) whether or not these amino acids are present at the surface of the enzyme protein and (2) whether or not these amino acids have been conserved in evolutionary terms. By this method, the number of tests required for designing a molecular structure of an inhibitor on an enzyme can be significantly reduced, as compared with the conventional method in which the inhibitor is searched for at random.

[0014] Additional objects and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objects and advantages of the invention may be realized and obtained by means of the instrumentalities and combinations particularly pointed out hereinafter.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWING

[0015] The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate presently preferred embodiments of the invention, and together with the general description given above and the detailed description of the preferred embodiments given below, serve to explain the principles of the invention.

[0016] FIG. 1 is a block diagram showing the scheme of each process of a first method of the present invention;

[0017] FIG. 2A is a view showing the amino acid sequence of rat pol. β. FIG. 2B is a view showing the amino acid sequence of enzyme DNA topoisomerase II (hereinafter also referred to as “topo II”);

[0018] FIG. 3 is a view of a three-dimensional model showing the amino acid chain of rat pol. β;

[0019] FIG. 4 is a block diagram showing the scheme of each process of a second method of the present invention;

[0020] FIG. 5 is a graph showing the inhibitory activity of nervonic acid on rat pol. β;

[0021] FIG. 6 is ¹H-¹⁵N HMQC spectroscopy of rat pol. β;

[0022] FIG. 7 is a bar graph showing shift values of ¹H HMQC spectroscopy of rat pol. β;

[0023] FIG. 8 is a bar graph showing shift values of ¹⁵N HMQC spectroscopy of rat pol. β;

[0024] FIG. 9A is a view showing the geometric configuration of the amino acid residues of rat pol. β, which residues are to be bound to nervonic acid. FIG. 9B is a view showing the geometric configuration of the amino acid residues of yeast topo II, which residues are to be bound to nervonic acid;

[0025] FIG. 10A is a view showing a three-dimensional model of rat pol. β, in which the position of the amino acid residues, which residues are to be bound to nervonic acid, is clearly indicated. FIG. 10B is a view showing a three-dimensional model of yeast topo II, in which the position of the amino acid residues, which residues are to be bound to nervonic acid, is clearly indicated;

[0026] FIGS. 11A-11F are views showing the primary sequence of amino acids of 31 types of enzymes used in the search of example 1;

[0027] FIG. 12A is a view showing the geometric configuration of the amino acid residues of rat pol. β, which residues are to be bound to nervonic acid. FIG. 12B is a view showing the geometric configuration of the amino acid residues of human immunodeficiency virus (HIV) reverse transcriptase, which residues are to be bound to nervonic acid;

[0028] FIG. 13A is a view showing a three-dimensional model of rat pol. β, in which the position of the amino acid residues, which residues are to be bound to nervonic acid, is clearly indicated. FIG. 13B is a view showing a three-dimensional model of HIV reverse transcriptase, in which the position of the amino acid residues, which residues are to be bound to nervonic acid, is clearly indicated;

[0029] FIG. 14A is a view showing a model in a state in which nervonic acid has bound to the 8 kDa domain of rat pol. β. FIG. 14B is a view in which the model of FIG. 14A has been turned by 90° around the vertical axis;

[0030] FIG. 15 is a graph showing the inhibitory activity of nervonic acid on rat DNA polymerase, HIV reverse transcriptase and the like;

[0031] FIG. 16 is a view showing the result of electrophoresis, which represents the inhibitory activity effected by nervonic acid on yeast topo II;

[0032] FIG. 17 is a view showing the primary sequence of amino acids of yeast topo II or the like, in which the amino acid residues which have been conserved, in terms of the evolution theory, are indicated;

[0033] FIG. 18 shows ¹H-¹⁵N HMQC spectroscopy of rat pol. β;

[0034] FIG. 19 is a bar graph showing shift values of ¹H HMQC spectroscopy of rat pol. β;

[0035] FIG. 20 is a bar graph showing shift values of ¹⁵N HMQC spectroscopy of rat pol. β;

[0036] FIG. 21A is a view showing the geometric configuration of the amino acid residues of rat pol. β, which residues are to be bound to lithocholic acid. FIG. 21B is a view showing the geometric configuration of the amino acid residues of yeast topo II, which residues are to be bound to lithocholic acid;

[0037] FIG. 22A is a view showing a three-dimensional model of rat pol. β, in which the position of the amino acid residues, which residues are to be bound to lithocholic acid, is clearly indicated. FIG. 22B is a view showing a three-dimensional model of yeast topo II, in which the position of the amino acid residues, which residues are to be bound to lithocholic acid, is clearly indicated; and

[0038] FIG. 23 is a view showing the primary sequence of amino acids of yeast topo II or the like, in which the amino acid residues which have been conserved, in terms of the evolution theory, are indicated.

DETAILED DESCRIPTION OF THE INVENTION

[0039] According to one aspect of the present invention, there is provided a method of designing a molecular structure of an inhibitor on an enzyme whose secondary and tertiary structures are known. Hereinafter, this method will be referred to as “the first method of the present invention”.

[0040] Specifically, the first method of the present invention is a method of designing a molecular structure of an inhibitor on a target enzyme whose secondary and tertiary structures are known, comprising:

[0041] analyzing, in the tertiary structure of a first enzyme whose primary, secondary and tertiary structures are known, a geometric configuration formed by amino acids to which a first inhibitor having biological, inhibitory activity on the first enzyme, is to bind; and

[0042] searching, in the target enzyme, amino acids constituting the similar geometric configuration to the geometric configuration that was analyzed in the first enzyme;

[0043] thereby designing the molecular structure of the inhibitor on the target enzyme.

[0044] In more detail, the first method of the present invention is a method of designing a molecular structure of an inhibitor on a target enzyme whose secondary and tertiary structures are known, comprising:

[0045] (1) a step of determining, in the tertiary structure of a first enzyme, a first pocket amino acid group consisting of plural amino acids to which a first inhibitor is to bind, wherein the plural amino acids are determined by comparing the ¹H-¹⁵N HMQC spectroscopy of the first enzyme in the absence of the first inhibitor with the ¹H-¹⁵N HMQC spectroscopy of the first enzyme in the presence of the first inhibitor, and selecting amino acids that satisfy the condition that the absolute value of ¹H chemical shift is equal to or more than a specific value and/or the absolute value of ¹⁵N chemical shift is equal to or more than another specific value;

[0046] (2) a step of determining the geometric configuration of the first pocket amino acid group, by measuring, in the tertiary structure of the first enzyme, the distances between the Cα atoms of each of the plural amino acids constituting the first pocket amino acid group;

[0047] (3) a step of detecting, in the tertiary structure of the target enzyme, one or more candidates for a target pocket amino acid group having a geometric configuration similar to the geometric configuration formed by the first pocket amino acid group determined in the step (2), wherein it is presumed that the candidate for the target pocket amino acid group is present in the target enzyme if the following conditions are satisfied:

[0048] respective amino acids constituting the candidate for the target pocket amino acid group are of the same kinds as the plural amino acids constituting the first pocket amino acid group determined in the step (2), and

[0049] the absolute values of differences between (a) respective distances, in the tertiary structure of the amino acids of the target enzyme, between the Cα atoms of each of the amino acids constituting the target pocket amino acid group, and (b) the respective distances between the Cα atoms in the first pocket amino acid group obtained in the step (2) are within a specific value; and

[0050] (4) a step of screening the amino acids constituting the candidate for the target pocket amino acid group by the following two requirements:

[0051] (requirement 1) the amino acids are those present at the surface of the target enzyme in the tertiary structure thereof; and

[0052] (requirement 2) the amino acids are those that have been conserved among species, when a comparison is made between the primary sequence of the amino acids of the target enzyme and the primary sequence of the amino acids of another enzyme having the same biochemical activity as the target enzyme but derived from an organism of a different species from that of the target enzyme, in terms of biological classification; and

[0053] determining the amino acids satisfying the above two requirements to be those constituting the usable target pocket amino acid group for designing the molecular structure of the inhibitor on the target enzyme.

[0054] In the first method of the present invention, the molecular structure of the inhibitor on the target enzyme is designed on the basis of the kinds of amino acid residues of the first enzyme, to which the first inhibitor is to bind, and the geometric configuration of amino acid residues.
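
As an illustration of the selection rule of the step (1), the following minimal Python sketch flags every residue whose peak moves by at least a threshold in the ¹H dimension and/or the ¹⁵N dimension when the inhibitor is added. The data layout, the function name `select_pocket_residues` and all peak positions are hypothetical and are not measured data. The thresholds of 0.06 ppm (¹H) and 0.4 ppm (¹⁵N) are the values used in the examples of this specification; the method itself only requires "a specific value" for each.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: residue label -> (1H shift in ppm, 15N shift in ppm).
Peaks = Dict[str, Tuple[float, float]]


def select_pocket_residues(
    free: Peaks, bound: Peaks, h_min: float, n_min: float
) -> List[str]:
    """Return residues with |delta 1H| >= h_min and/or |delta 15N| >= n_min."""
    selected = []
    for residue, (h_free, n_free) in free.items():
        if residue not in bound:
            continue  # no assigned peak in the inhibitor-bound spectrum
        h_bound, n_bound = bound[residue]
        if abs(h_bound - h_free) >= h_min or abs(n_bound - n_free) >= n_min:
            selected.append(residue)
    return selected


# Invented peak positions, for illustration only.
free = {"Leu-11": (8.10, 120.0), "Gly-12": (8.30, 109.0), "Lys-35": (7.90, 118.0)}
bound = {"Leu-11": (8.18, 120.1), "Gly-12": (8.31, 109.1), "Lys-35": (7.91, 118.5)}

pocket = select_pocket_residues(free, bound, h_min=0.06, n_min=0.4)
print(pocket)  # ['Leu-11', 'Lys-35']
```

In this toy input, Leu-11 passes on the ¹H criterion alone and Lys-35 on the ¹⁵N criterion alone, which is the "and/or" behavior described in the step (1).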

[0055] An example of carrying out the first method of the present invention mentioned above is as follows. However, the first method of the present invention is not limited to this.

[0056] A method of designing a molecular structure of an inhibitor on human immunodeficiency virus (HIV) reverse transcriptase by analyzing, in the tertiary structure of rat DNA polymerase β, a geometric configuration formed by amino acids to which nervonic acid is to bind; searching, in HIV reverse transcriptase, the similar geometric configuration to the geometric configuration analyzed in rat DNA polymerase β; thereby designing the molecular structure of the inhibitor on HIV reverse transcriptase on the basis of the kinds of amino acid residues of rat DNA polymerase β, to which nervonic acid is to bind, and the geometric configuration of the amino acid residues.

[0057] In more detail, the example of carrying out the first method of the present invention is a method of designing a molecular structure of an inhibitor on HIV reverse transcriptase comprising the following steps (1) to (4):

[0058] (1) a step of determining, in the tertiary structure of rat DNA polymerase β, its pocket amino acid group consisting of plural amino acids to which nervonic acid is to bind. In the step (1), the plural amino acids are determined by comparing the ¹H-¹⁵N HMQC spectroscopy of rat DNA polymerase β in the absence of nervonic acid with the ¹H-¹⁵N HMQC spectroscopy of rat DNA polymerase β in the presence of nervonic acid, and selected from those having the absolute value of ¹H chemical shift of 0.06 ppm or more, and/or having the absolute value of ¹⁵N chemical shift of 0.4 ppm or more. In this example, the plural amino acids constituting the rat DNA polymerase β pocket amino acid group are determined to be Leu-11, Lys-35, His-51 and Thr-79.

[0059] (2) a step of determining the geometric configuration of the rat DNA polymerase β pocket amino acid group. In the step (2), the distances, in the tertiary structure of rat DNA polymerase β, between the Cα atoms of each of Leu-11, Lys-35, His-51 and Thr-79 constituting the rat DNA polymerase β pocket amino acid group are measured, whereby the geometric configuration of a triangular pyramid formed by Leu-11, Lys-35, His-51 and Thr-79 is determined.

[0060] (3) a step of detecting, in the tertiary structure of the amino acids of HIV reverse transcriptase, one or more candidates for a target pocket amino acid group having a geometric configuration similar to the geometric configuration of the rat DNA polymerase β pocket amino acid group determined in the step (2). In the step (3), amino acids constituting the candidate for the target pocket amino acid group are the same type as those constituting the rat DNA polymerase β pocket amino acid group, i.e., Leu, Lys, His and Thr. Further, the differences between (a) and (b) are within ±2×10⁻¹ nm, wherein (a) is respective distances, in the tertiary structure of HIV reverse transcriptase, between the Cα atoms of each of the amino acids constituting the target pocket amino acid group, and (b) is the respective distances between the Cα atoms obtained in the step (2).

[0061] (4) a step of determining by screening whether or not the candidate for a target pocket amino acid group is usable for designing the molecular structure of the inhibitor on HIV reverse transcriptase. In the step (4), the amino acids, Lys, Leu, His and Thr, which are selected as the amino acids constituting the candidate for the target pocket amino acid group are screened by the following two requirements:

[0062] (requirement 1) the amino acids are those present at the surface of HIV reverse transcriptase in the tertiary structure thereof; and

[0063] (requirement 2) the amino acids are those that have been conserved among species, when a comparison is made between the primary sequence of the amino acids of HIV reverse transcriptase and the primary sequence of the amino acids of immunodeficiency virus reverse transcriptase that is derived from an organism of a different species, in terms of biological classification, from human.

[0064] In this example, Lys-65, Leu-100, His-235 and Thr-386 of HIV reverse transcriptase satisfying the above two requirements were determined to be amino acids constituting the target pocket amino acid group usable for designing a molecular structure of the inhibitor on HIV reverse transcriptase.

[0065] In the example of carrying out the first method of the present invention, the molecular structure of the inhibitor on HIV reverse transcriptase is designed on the basis of the kinds of amino acid residues of rat DNA polymerase β, to which nervonic acid is to bind, and the geometric configuration of the amino acid residues.
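
The geometric steps (2) and (3) can be sketched in Python as follows. This is a brute-force illustration under stated assumptions, not the search procedure actually used by the inventors: the tuple layout, the function names `ca_distances` and `find_candidates`, the residue numbering and every coordinate are invented for the example. Only the residue types (Leu, Lys, His, Thr) and the tolerance of 2×10⁻¹ nm come from the text above.

```python
import math
from itertools import combinations, product
from typing import List, Tuple

Coord = Tuple[float, float, float]   # C-alpha position, in nm
Residue = Tuple[str, int, Coord]     # (residue type, sequence number, position)


def ca_distances(group: List[Residue]) -> List[float]:
    """Pairwise C-alpha distances of a group, in a fixed pair order."""
    return [math.dist(a[2], b[2]) for a, b in combinations(group, 2)]


def find_candidates(
    reference: List[Residue], target: List[Residue], tol: float = 0.2
) -> List[List[Residue]]:
    """Groups in `target` with the same residue types as `reference` whose
    corresponding C-alpha distances all differ by at most `tol` nm."""
    ref_d = ca_distances(reference)
    # For each reference residue, collect target residues of the same type.
    pools = [[r for r in target if r[0] == ref[0]] for ref in reference]
    hits = []
    for combo in product(*pools):
        if len({r[1] for r in combo}) < len(combo):
            continue  # the same target residue was picked twice
        cand_d = ca_distances(list(combo))
        if all(abs(d - rd) <= tol for d, rd in zip(cand_d, ref_d)):
            hits.append(list(combo))
    return hits


# Invented coordinates: a reference triangular pyramid and a target that
# contains a slightly distorted, shifted copy plus two decoy residues.
reference = [
    ("Leu", 1, (0.0, 0.0, 0.0)), ("Lys", 2, (1.0, 0.0, 0.0)),
    ("His", 3, (0.0, 1.0, 0.0)), ("Thr", 4, (0.0, 0.0, 1.0)),
]
target = [
    ("Leu", 7, (5.0, 5.0, 5.0)), ("Lys", 3, (6.1, 5.0, 5.0)),
    ("His", 12, (5.0, 6.0, 5.0)), ("Thr", 20, (5.0, 5.0, 5.9)),
    ("Leu", 40, (9.0, 9.0, 9.0)), ("Thr", 55, (20.0, 0.0, 0.0)),
]

hits = find_candidates(reference, target)
print([[r[1] for r in group] for group in hits])  # [[7, 3, 12, 20]]
```

Four residues give six Cα-Cα distances, and all six must agree within the tolerance. Because only distances are compared, this sketch cannot distinguish a configuration from its mirror image, and the candidates it returns still have to pass the screening of the step (4).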

[0066] In another aspect of the present invention, there is provided a method of designing a molecular structure of an inhibitor on an enzyme whose tertiary structure is unknown. Hereinafter, this method will be referred to as “the second method of the present invention”.

[0067] Specifically, the second method of the present invention is a method of designing a molecular structure of an inhibitor on a target enzyme whose tertiary structure is unknown, comprising:

[0068] analyzing, in the tertiary structure of a first enzyme whose primary, secondary and tertiary structures are known, a geometric configuration formed by amino acids to which a first inhibitor having biological, inhibitory activity on the first enzyme is to bind; and

[0069] searching, in a second enzyme whose primary, secondary and tertiary structures are known, amino acids constituting the similar geometric configuration to the geometric configuration that was analyzed in the first enzyme;

[0070] thereby designing the molecular structure of the inhibitor on the target enzyme whose tertiary structure is unknown.

[0071] In more detail, the second method of the present invention is a method of designing a molecular structure of an inhibitor on a target enzyme whose tertiary structure is unknown, comprising:

[0072] (1) a step of determining, in the tertiary structure of a first enzyme, a first pocket amino acid group consisting of plural amino acids to which a first inhibitor is to bind, wherein the plural amino acids are determined by comparing the ¹H-¹⁵N HMQC spectroscopy of the first enzyme in the absence of the first inhibitor with the ¹H-¹⁵N HMQC spectroscopy of the first enzyme in the presence of the first inhibitor, and selecting amino acids that satisfy the condition that the absolute value of ¹H chemical shift is equal to or more than a specific value and/or the absolute value of ¹⁵N chemical shift is equal to or more than another specific value;

[0073] (2) a step of determining the geometric configuration of the first pocket amino acid group, by measuring, in the tertiary structure of the first enzyme, the distances between the Cα atoms of each of the plural amino acids constituting the first pocket amino acid group;

[0074] (3) a step of detecting, in the tertiary structure of a second enzyme, a second pocket amino acid group having a geometric configuration similar to the geometric configuration formed by the first pocket amino acid group determined in the step (2), wherein it is presumed that the second pocket amino acid group has a geometric configuration similar to the geometric configuration formed by the first pocket amino acid group if the following requirements are satisfied:

[0075] (requirement 1) the second enzyme has the same biochemical activity as that of the target enzyme;

[0076] (requirement 2) the organism from which the second enzyme is derived, belongs to a different species, in terms of biological classification, from the organism from which the target enzyme is derived;

[0077] (requirement 3) the primary, secondary and tertiary structures of the second enzyme are known; and

[0078] (requirement 4) the second enzyme is biochemically inhibited by the first inhibitor, and

[0079] respective amino acids constituting the second pocket amino acid group are of the same kinds as the plural amino acids constituting the first pocket amino acid group determined in the step (2), and

[0080] the absolute values of differences between (a) respective distances, in the tertiary structure of the second enzyme, between the Cα atoms of each of the amino acids constituting the second pocket amino acid group, and (b) the respective distances between the Cα atoms in the first pocket amino acid group obtained in the step (2) are within a specific value; and

[0081] (4) a step of screening the amino acids constituting the second pocket amino acid group by the following two requirements:

[0082] (requirement 1) the amino acids are those present at the surface of the second enzyme in the tertiary structure thereof; and

[0083] (requirement 2) the amino acids are those that have been conserved among species, when a comparison is made among the primary sequence of the amino acids of the second enzyme, the primary sequence of the amino acids of the target enzyme and the primary sequence of the amino acids of another enzyme having the same biochemical activity as that of the second enzyme but derived from an organism of a different species from that of both the second enzyme and the target enzyme, in terms of biological classification; and

[0084] determining the amino acids satisfying the above two requirements to be those constituting the usable target pocket amino acid group for designing the molecular structure of the inhibitor on the target enzyme.

[0085] In the second method of the present invention, the molecular structure of the inhibitor on the target enzyme is designed on the basis of the kinds of amino acid residues of the first enzyme, to which the first inhibitor is to bind, and the geometric configuration of the amino acid residues.
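
The four requirements that the second enzyme must meet in the step (3) amount to a simple filter over candidate enzymes, which the following Python sketch illustrates. The record type `EnzymeRecord`, its fields, the function `eligible_second_enzymes` and the two placeholder candidates are hypothetical; only the yeast DNA topoisomerase II entry reflects an enzyme used in the examples of this specification.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EnzymeRecord:
    name: str
    activity: str             # biochemical activity
    species: str              # organism of origin
    structure_known: bool     # primary, secondary and tertiary structures known
    inhibited_by_first: bool  # biochemically inhibited by the first inhibitor


def eligible_second_enzymes(
    target_activity: str, target_species: str, candidates: List[EnzymeRecord]
) -> List[EnzymeRecord]:
    """Keep the candidates that satisfy requirements 1 to 4 of the step (3)."""
    return [
        c for c in candidates
        if c.activity == target_activity   # requirement 1: same activity
        and c.species != target_species    # requirement 2: different species
        and c.structure_known              # requirement 3: structures known
        and c.inhibited_by_first           # requirement 4: inhibited
    ]


candidates = [
    EnzymeRecord("yeast topo II", "DNA topoisomerase II", "yeast", True, True),
    # Placeholder entries, invented to show the filter rejecting a record.
    EnzymeRecord("enzyme X", "DNA topoisomerase II", "human", True, True),
    EnzymeRecord("enzyme Y", "DNA topoisomerase II", "mouse", False, True),
]

chosen = eligible_second_enzymes("DNA topoisomerase II", "human", candidates)
print([c.name for c in chosen])  # ['yeast topo II']
```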

[0086] An example of carrying out the second method of the present invention mentioned above is as follows. However, the second method of the present invention is not limited to this.

[0087] A method of designing a molecular structure of an inhibitor on human DNA topoisomerase II whose tertiary structure is unknown by analyzing, in the tertiary structure of rat DNA polymerase β, a geometric configuration formed by amino acids to which nervonic acid is to bind; searching, in yeast DNA topoisomerase II, the similar geometric configuration to the geometric configuration that was analyzed in rat DNA polymerase β; thereby designing the molecular structure of the inhibitor on human topoisomerase II on the basis of the kinds of amino acid residues of yeast DNA topoisomerase II, to which nervonic acid is to bind, and the geometric configuration of the amino acid residues.

[0088] In more detail, the example of carrying out the second method of the present invention is a method of designing a molecular structure of an inhibitor on human topoisomerase II comprising the following steps (1) to (4):

[0089] (1) a step of determining, in the tertiary structure of rat DNA polymerase β, a rat DNA polymerase β pocket amino acid group consisting of plural amino acids to which nervonic acid is to bind. In the step (1), the plural amino acids are determined by comparing the ¹H-¹⁵N HMQC spectroscopy of rat DNA polymerase β in the absence of nervonic acid with the ¹H-¹⁵N HMQC spectroscopy of rat DNA polymerase β in the presence of nervonic acid, and selected from those having the absolute value of ¹H chemical shift of 0.06 ppm or more, and/or having the absolute value of ¹⁵N chemical shift of 0.4 ppm or more. In this example, the plural amino acids constituting the rat DNA polymerase β pocket amino acid group are determined to be Leu-11, Lys-35, His-51 and Thr-79.

[0090] (2) a step of determining the geometric configuration of the rat DNA polymerase β pocket amino acid group. In the step (2), the distances, in the tertiary structure of rat DNA polymerase β, between the Cα atoms of each of Leu-11, Lys-35, His-51 and Thr-79 constituting the rat DNA polymerase β pocket amino acid group are measured, whereby the geometric configuration of a triangular pyramid formed by Leu-11, Lys-35, His-51 and Thr-79 is determined.

[0091] (3) a step of selecting, in the tertiary structure of yeast DNA topoisomerase II, a yeast DNA topoisomerase II pocket amino acid group consisting of Leu, Lys, His and Thr, and having a geometric configuration similar to the geometric configuration of the rat DNA polymerase β pocket amino acid group determined in the step (2). In the step (3), when the differences between (a) and (b) are within ±2×10⁻¹ nm, wherein (a) is respective distances, in the tertiary structure of the selected amino acids, between the Cα atoms of Leu, Lys, His and Thr constituting the yeast DNA topoisomerase II pocket amino acid group, and (b) is the respective distances between the Cα atoms obtained in the step (2), such Leu, Lys, His and Thr are determined to be the amino acids constituting the yeast DNA topoisomerase II pocket amino acid group.

[0092] (4) a step of determining by screening whether or not the yeast DNA topoisomerase II pocket amino acid group is usable for designing the molecular structure of the inhibitor on human DNA topoisomerase II. In the step (4), the amino acids, Leu, Lys, His and Thr, which are selected as the amino acids constituting the yeast DNA topoisomerase II pocket amino acid group are screened by the following two requirements:

[0093] (requirement 1) the amino acids are those present at the surface of yeast DNA topoisomerase II in the tertiary structure thereof; and

[0094] (requirement 2) the amino acids are those that have been conserved among species, when a comparison is made among the primary sequence of the amino acids of yeast DNA topoisomerase II, the primary sequence of the amino acids of human DNA topoisomerase II and the primary sequence of the amino acids of topoisomerase II derived from an organism of a different species from both yeast and human, in terms of biological classification; and

[0095] determining the amino acids, Thr-596, His-735, Leu-741 and Lys-983, satisfying the above two requirements to be usable for designing the molecular structure of the inhibitor on the target enzyme.

[0096] In the example of carrying out the second method of the present invention, the molecular structure of the inhibitor on human topoisomerase II is designed on the basis of the kinds of amino acid residues of yeast DNA topoisomerase II, to which nervonic acid is to bind, and the geometric configuration of the amino acid residues.
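
The screening of the step (4) can be sketched in Python as follows, assuming that the sequences have already been aligned and that a per-position measure of surface exposure is available. The specification states no numerical criterion for "present at the surface", so the relative accessibility values and the cutoff `min_access` below are assumptions made purely for illustration, as are the function names and the short toy sequences.

```python
from typing import Dict, List, Sequence


def is_conserved(aligned: Sequence[str], column: int) -> bool:
    """True if all aligned sequences carry the same residue (not a gap)."""
    letters = {seq[column] for seq in aligned}
    return len(letters) == 1 and "-" not in letters


def screen_group(
    columns: List[int],
    accessibility: Dict[int, float],
    aligned: Sequence[str],
    min_access: float,
) -> bool:
    """Requirement 1 (surface) and requirement 2 (conservation) must both
    hold for every residue of the candidate pocket amino acid group."""
    return all(
        accessibility.get(c, 0.0) >= min_access and is_conserved(aligned, c)
        for c in columns
    )


# Toy alignment: second enzyme, target enzyme, enzyme of a third species.
aligned = ["MKT-LHA", "MKTALHS", "MKT-LHA"]
# Hypothetical relative accessibility of the second enzyme, by column.
accessibility = {1: 0.6, 2: 0.5, 4: 0.05, 5: 0.7, 6: 0.8}

print(screen_group([1, 2, 5], accessibility, aligned, min_access=0.2))  # True
print(screen_group([1, 4, 5], accessibility, aligned, min_access=0.2))  # False
print(screen_group([1, 2, 6], accessibility, aligned, min_access=0.2))  # False
```

The second group fails because column 4 is buried, and the third because column 6 differs between the sequences. For simplicity the positions are given as alignment columns; mapping residue numbers such as Thr-596 onto alignment columns is left out of this sketch.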

[0097] Another example of carrying out the second method of the present invention mentioned above is as follows. However, the second method of the present invention is not limited to this, either.

[0098] A method of designing a molecular structure of an inhibitor on human DNA topoisomerase II whose tertiary structure is unknown by analyzing, in the tertiary structure of rat DNA polymerase β, a geometric configuration formed by amino acids to which lithocholic acid is to bind; searching, in yeast DNA topoisomerase II, the similar geometric configuration to the geometric configuration that was analyzed in rat DNA polymerase β; thereby designing the molecular structure of the inhibitor on human topoisomerase II on the basis of the kinds of amino acid residues of yeast DNA topoisomerase II, to which lithocholic acid is to bind, and the geometric configuration of the amino acid residues.

[0099] In more detail, the example of carrying out the second method of the present invention is a method of designing a molecular structure of an inhibitor on human topoisomerase II comprising the following steps (1) to (4):

[0100] (1) a step of determining, in the tertiary structure of rat DNA polymerase β, a rat DNA polymerase β pocket amino acid group consisting of plural amino acids to which lithocholic acid is to bind. In the step (1), the plural amino acids are determined by comparing the ¹H-¹⁵N HMQC spectroscopy of rat DNA polymerase β in the absence of lithocholic acid with the ¹H-¹⁵N HMQC spectroscopy of rat DNA polymerase β in the presence of lithocholic acid, and selected from those having the absolute value of ¹H chemical shift of 0.06 ppm or more, and/or having the absolute value of ¹⁵N chemical shift of 0.4 ppm or more. In this example, the plural amino acids constituting the rat DNA polymerase β pocket amino acid group are determined to be Lys-60, Leu-77 and Thr-79.

[0101] (2) a step of determining the geometric configuration of the rat DNA polymerase β pocket amino acid group. In the step (2), the distances, in the tertiary structure of rat DNA polymerase β, between the Cα atoms of each of Lys-60, Leu-77 and Thr-79 constituting the rat DNA polymerase β pocket amino acid group are measured, thereby the geometric configuration of a triangular shape formed by Lys-60, Leu-77 and Thr-79 is determined.

[0102] (3) a step of selecting, in the tertiary structure of yeast DNA topoisomerase II, a yeast DNA topoisomerase II pocket amino acid group consisting of Lys, Leu and Thr, and having a geometric configuration similar to the geometric configuration of the rat DNA polymerase β pocket amino acid group determined in the step (2). In the step (3), when the differences between (a) and (b) are within ±2×10⁻¹ nm, wherein (a) is the respective distances, in the tertiary structure of yeast DNA topoisomerase II, between the Cα atoms of the selected Lys, Leu and Thr, and (b) is the respective distances between the Cα atoms obtained in the step (2), such Lys, Leu and Thr are determined to be the amino acids constituting the yeast DNA topoisomerase II pocket amino acid group.

[0103] (4) a step of determining by screening whether or not the yeast DNA topoisomerase II pocket amino acid group is usable for designing the molecular structure of the inhibitor on human DNA topoisomerase II. In the step (4), the amino acids, Lys, Leu and Thr, which are selected as the amino acids constituting the yeast DNA topoisomerase II pocket amino acid group are screened by the following two requirements:

[0104] (requirement 1) the amino acids are those present at the surface of yeast DNA topoisomerase II in the tertiary structure thereof; and

[0105] (requirement 2) the amino acids are those that have been conserved among species, when a comparison is made among the primary sequence of the amino acids of yeast DNA topoisomerase II, the primary sequence of the amino acids of human DNA topoisomerase II and the primary sequence of the amino acids of topoisomerase II derived from an organism of a different species from both yeast and human, in terms of biological classification; and

[0106] determining the amino acids, Lys-720, Leu-760 and Thr-791, satisfying the above two requirements to be those constituting the target pocket amino acid group usable for designing the molecular structure of the inhibitor on the target enzyme.

[0107] In the example of carrying out the second method of the present invention, the molecular structure of the inhibitor on human topoisomerase II is designed on the basis of the kinds of amino acid residues of yeast DNA topoisomerase II, to which lithocholic acid is to bind, and the geometric configuration of the amino acid residues.

[0108] Now, the first method of the present invention, in which a molecular structure of an inhibitor on an enzyme whose secondary and tertiary structures are known is designed, will be described in more detail.

[0109] FIG. 1 is a block diagram showing the scheme of each process of the first method of the present invention.

[0110] The first method of the present invention is a method of designing a molecular structure of an inhibitor on a target enzyme whose secondary and tertiary structures are known, comprising: analyzing, in the tertiary structure of a first enzyme whose primary, secondary and tertiary structures are known, a geometric configuration formed by amino acids to which a first inhibitor having biological, inhibitory activity on the first enzyme is to be bound; and searching, in the target enzyme, amino acids constituting the similar geometric configuration to the geometric configuration that was analyzed in the first enzyme, thereby designing the molecular structure of the inhibitor on the target enzyme.

[0111] In the first method of the present invention, the primary, secondary and tertiary structures of the first enzyme used in the step (1) are known. Herein, the expression “the primary, secondary and tertiary structures are known” does not mean that every amino acid is in a state where assignment of its cross peak by NMR has been completed, but rather means that the primary, secondary and tertiary structures of the enzyme have been identified as a whole of the enzyme. In the present invention, it is convenient to use an enzyme whose primary, secondary and tertiary structures are already known. However, it is also possible to apply the first method of the present invention after analyzing the primary, secondary and tertiary structures of an enzyme. The primary, secondary and tertiary structures of an enzyme can be analyzed by employing the techniques known in the art, such as biochemical technique, NMR and X-ray crystal structure analysis, in combination.

[0112] One example of the enzyme whose primary, secondary and tertiary structures have been identified is rat DNA polymerase β (hereinafter also referred to as “rat pol. β”). Rat pol. β is the only enzyme whose primary, secondary and tertiary structures are known, among the DNA polymerase group which represent the enzymes that are essential to living organisms and catalyze the reaction of synthesizing DNA as the main body of genes.

[0113] The amino acid sequence of rat pol. β is shown in reference 5, and the tertiary structure thereof (including the secondary structure) is shown in references 6 and 7. FIG. 2A shows the primary sequence of amino acid residues of rat pol. β (refer to the SEQ ID No. 1 of the Sequence Listing described below). The amino acid sequence is available from the SWISS-PROT Data Base. FIG. 3 is a schematic view showing the tertiary structure of rat pol. β, in which the principal chain of the amino acids is represented as a ribbon.

[0114] As shown in FIG. 3, rat pol. β is a single peptide constituted of the 8 kDa domain (1-87 amino acid residues) on the N-terminal side and the 31 kDa domain (88-335 amino acid residues) on the C-terminal side, in which the 31 kDa domain is connected with the 8 kDa domain. The 8 kDa domain can be cut off from the 31 kDa domain by mild hydrolysis with trypsin. Attribution of cross peaks by NMR has been completed in 75 residues of the total 87 amino acid residues of the 8 kDa domain (refer to reference 9).

[0115] The first inhibitor on the first enzyme used in the first method of the present invention is preferably an inhibitor whose inhibitory activity (e.g., IC₅₀) is in the order of μM or lower, although the first inhibitor is not particularly restricted thereto. The manner of inhibition may be either competitive or non-competitive. When the enzyme used in the first method of the present invention is rat pol. β, the first inhibitor preferably inhibits, in a competitive manner, the binding of rat pol. β to the DNA strand as the substrate.

[0116] Examples of the inhibitor having biochemical inhibitory activity on rat pol. β include nervonic acid (cis-15-tetracosenoic acid) and lithocholic acid (3-α-hydroxy-5β-cholanoic acid).

[0117] The chemical formulae of nervonic acid and lithocholic acid are shown below:

[0118] Nervonic acid is one of the fatty acids which most strongly inhibit rat pol. β. Lithocholic acid has been reported by Ogawa et al. (Jpn. J. Cancer Res. 89, 1154-1159, 1998), for example, as a representative example of steroids which are known as inhibitors on DNA polymerase. Both nervonic acid and lithocholic acid are commercially available and can easily be obtained in large amounts.

[0119] In the step (1) of the first method of the present invention, a first pocket amino acid group consisting of plural amino acids to which the first inhibitor is to bind, is determined in the tertiary structure of the first enzyme.

[0120] The plural amino acids constituting the first pocket amino acid group are decided by measuring the ¹H-¹⁵N HMQC (heteronuclear multiple-quantum correlated spectroscopy) of the first enzyme in the absence of the first inhibitor and the ¹H-¹⁵N HMQC spectroscopy of the first enzyme in the presence of the first inhibitor; overlapping the cross peak of each spectroscopy; and selecting amino acids that satisfy the condition that at least one of the absolute values of ¹H and ¹⁵N chemical shift of the amino acids is equal to or more than a specific value. Herein, the shift value is the one which can be set in accordance with the type or the like of the enzyme and the inhibitor used in the first method of the present invention. Although the values of ¹H and ¹⁵N chemical shift are generally set such that the number of the amino acid residues to be selected is three to four, the number of the amino acid residues is not restricted thereto. In the present invention, the number of amino acids is set in a range of three to four, because the minimum number of amino acids required to specify a geometric configuration formed by the amino acid residues is three and also the number of amino acids to be selected should not be uselessly enlarged in order to avoid making the identification task of the geometric configuration complicated.
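The selection rule described above can be sketched in Python as follows. This is a minimal illustration only, not the measurement procedure itself: the function name `select_pocket_residues`, the dictionary layout of the spectra and all peak positions are hypothetical, and the thresholds are passed in as parameters because the text leaves them adjustable.

```python
from typing import Dict, List, Tuple

# Each spectrum maps a residue label to its (1H, 15N) cross-peak position in ppm.
Spectrum = Dict[str, Tuple[float, float]]


def select_pocket_residues(free: Spectrum, bound: Spectrum,
                           h_threshold: float, n_threshold: float) -> List[str]:
    """Return residues whose |1H shift| or |15N shift| reaches its threshold."""
    selected = []
    for residue, (h_free, n_free) in free.items():
        if residue not in bound:
            continue  # cross peak not assigned in the inhibitor-bound spectrum
        h_bound, n_bound = bound[residue]
        # "At least one of" the two absolute shift values must meet its threshold.
        if abs(h_bound - h_free) >= h_threshold or abs(n_bound - n_free) >= n_threshold:
            selected.append(residue)
    return selected


# Hypothetical peak positions, for illustration only (not measured data).
free = {"Ala-1": (8.10, 120.0), "Gly-2": (8.50, 110.0), "Ser-3": (7.90, 115.0)}
bound = {"Ala-1": (8.20, 120.1), "Gly-2": (8.51, 110.9), "Ser-3": (7.91, 115.1)}
print(select_pocket_residues(free, bound, h_threshold=0.06, n_threshold=0.4))
```

In this sketch, raising either threshold shrinks the selected group, which is one way to keep the number of selected residues in the range of three to four.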

[0121] The ¹H-¹⁵N HMQC spectroscopy can be measured by using Varian Unity-Plus 750 spectrometer, manufactured by Varian Co., Ltd. The measurement condition of the ¹H-¹⁵N HMQC spectroscopy may be set according to reference 4 listed below.

[0122] The amount of the inhibitor to be added when the ¹H-¹⁵N HMQC spectroscopy is measured in the presence of the inhibitor may be set by biochemically analyzing, in advance, the amount of the inhibitor which is required for inhibiting the enzymatic activity.

[0123] In the first method of the present invention, when rat pol. β and nervonic acid are used as the combination of the first enzyme and the first inhibitor, respectively, the amount of nervonic acid to be added may be equimolar to the amount of rat pol. β. The absolute values of ¹H and ¹⁵N chemical shift as the indices may be set at 0.06 ppm or larger and 0.4 ppm or larger, respectively. In the amino acid sequence of rat pol. β, the amino acid residues that satisfy such ¹H or ¹⁵N chemical shift, are the following four amino acid residues: Leu-11, Lys-35, His-51 and Thr-79.

[0124] Further, in the first method of the present invention, when rat pol. β and lithocholic acid are used as the combination of the first enzyme and the first inhibitor, respectively, the amount of lithocholic acid to be added may be equimolar to the amount of rat pol. β. The absolute values of ¹H and ¹⁵N chemical shift as the indices may be set at 0.06 ppm or larger and 0.6 ppm or larger, respectively. In the amino acid sequence of rat pol. β, the amino acid residues that satisfy such ¹H or ¹⁵N chemical shift are the following three amino acid residues: Lys-60, Leu-77 and Thr-79.

[0125] In the step (2) of the first method of the present invention, the geometric configuration of the first pocket amino acid group selected in the step (1) is determined. Specifically, the determination of the geometric configuration is carried out by obtaining, in the tertiary structure of the first enzyme, the distances between the Cα atoms of each of the amino acids constituting the first pocket amino acid group.

[0126] The distances between the Cα atoms can be calculated by using Protein Data Bank (Protein data Bank: 1BNO in the case of rat pol. β, and Protein data Bank: 1BJT in the case of yeast topo II), whereby the geometric configuration formed by the Cα atoms of the amino acids constituting the first pocket amino acid group can be analyzed. The technique of this analysis is described in detail in reference 4 listed below.
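As an illustration of the distance calculation, the following Python sketch computes every pairwise Cα-Cα distance for a small group of residues. It assumes that the Cα coordinates have already been extracted from a structure file and are given in angstroms, as in Protein Data Bank entries; the function name `ca_distances_nm` and the coordinates are hypothetical and are not taken from 1BNO or 1BJT.

```python
import math
from itertools import combinations
from typing import Dict, Tuple

Coord = Tuple[float, float, float]
ANGSTROM_TO_NM = 0.1  # coordinates are assumed to be in angstroms


def ca_distances_nm(ca_coords: Dict[str, Coord]) -> Dict[Tuple[str, str], float]:
    """Return the distance in nm between the C-alpha atoms of every residue pair."""
    return {
        (a, b): math.dist(ca_coords[a], ca_coords[b]) * ANGSTROM_TO_NM
        for a, b in combinations(ca_coords, 2)
    }


# Hypothetical C-alpha coordinates in angstroms (illustration only).
pocket = {"A": (0.0, 0.0, 0.0), "B": (3.0, 4.0, 0.0), "C": (0.0, 0.0, 12.0)}
for pair, distance in ca_distances_nm(pocket).items():
    print(pair, round(distance, 3))
```

For three residues the result is the three side lengths of a triangle; for four residues it is the six edge lengths of a tetrahedron.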

[0127] In the step (3) of the first method of the present invention, one or more candidates for a target pocket amino acid group having a geometric configuration similar to the geometric configuration of the first pocket amino acid group determined in the step (2) are detected in the target enzyme. The detection is carried out by selecting plural amino acids of the same kinds as the respective amino acids constituting the first pocket amino acid group determined in the step (2), such that the distances, in the tertiary structure of the target enzyme, between the Cα atoms of each of the selected plural amino acids meet the following condition. That is, when comparison is made between (a) and (b), wherein (a) is the respective distances between the Cα atoms of each of the amino acids constituting the target pocket amino acid group, and (b) is the respective distances between the Cα atoms in the first pocket amino acid group obtained in the step (2), the differences in the corresponding distances between (a) and (b) are within a specific value.

[0128] Herein, “the corresponding distances between the Cα atoms of each of the amino acids” means that when the amino acids constituting the first pocket amino acid group are expressed as A, B and C and the amino acids constituting the target pocket amino acid group are expressed as a, b and c, the distance between the Cα atoms of A and B corresponds to the distance between the Cα atoms of a and b; the distance between the Cα atoms of B and C corresponds to the distance between the Cα atoms of b and c; and the distance between the Cα atoms of A and C corresponds to the distance between the Cα atoms of a and c. The amino acids A and a are the same kind to each other. Similarly, the amino acids B and b are the same kind, and the amino acids C and c are the same kind.

[0129] The differences in the corresponding distances between the Cα atoms of each of the amino acids may generally be set within a range of ±2×10⁻¹ nm, although the range may be changed in accordance with the type or the like of the first enzyme, the target enzyme and the first inhibitor used. It should be noted, however, that a too large difference in the corresponding distances between the Cα atoms of each of the amino acids represents a large difference between the geometric configuration formed by the amino acid residues constituting the first pocket amino acid group and that formed by the amino acid residues constituting the target pocket amino acid group, which is not preferable because the precision of the geometric configuration of the inhibitor on the target enzyme will then deteriorate.
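The search of the step (3) can be sketched as a brute-force enumeration in Python. This is an assumed, simplified implementation for illustration: the function name `find_candidate_groups`, the tuple layout of a residue and the coordinates are all hypothetical, Cα positions are taken to be in nm, and the tolerance defaults to the ±2×10⁻¹ nm range mentioned above.

```python
import math
from itertools import combinations, product
from typing import List, Tuple

Coord = Tuple[float, float, float]
Residue = Tuple[str, int, Coord]  # (kind, sequence number, C-alpha position in nm)


def find_candidate_groups(first_pocket: List[Residue], target: List[Residue],
                          tolerance_nm: float = 0.2) -> List[Tuple[int, ...]]:
    """Return sequence-number tuples of target residues matching the first pocket."""
    # For each first-pocket residue, list the target residues of the same kind.
    options = [[t for t in target if t[0] == p[0]] for p in first_pocket]
    matches = []
    for group in product(*options):
        numbers = tuple(r[1] for r in group)
        if len(set(numbers)) < len(numbers):
            continue  # the same target residue cannot be used twice
        # Compare corresponding distances: (A, B) with (a, b), and so on.
        if all(
            abs(math.dist(group[i][2], group[j][2])
                - math.dist(first_pocket[i][2], first_pocket[j][2])) <= tolerance_nm
            for i, j in combinations(range(len(first_pocket)), 2)
        ):
            matches.append(numbers)
    return matches


# Hypothetical coordinates in nm (illustration only, not real structures).
first_pocket = [("Lys", 1, (0.0, 0.0, 0.0)), ("Leu", 2, (1.0, 0.0, 0.0)),
                ("Thr", 3, (0.0, 1.0, 0.0))]
target = [("Lys", 10, (5.0, 5.0, 5.0)), ("Leu", 20, (6.1, 5.0, 5.0)),
          ("Thr", 30, (5.0, 5.9, 5.0)), ("Leu", 40, (9.0, 5.0, 5.0)),
          ("Thr", 50, (5.0, 5.0, 9.0))]
print(find_candidate_groups(first_pocket, target))
```

Enumerating every combination is adequate for pockets of three or four residues; a larger pocket or a very large target enzyme would call for pruning partial groups as soon as one distance falls outside the tolerance.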

[0130] In the first method of the present invention, when rat pol. β, nervonic acid and HIV reverse transcriptase are used as the first enzyme, the first inhibitor and the target enzyme, respectively, the Leu, Lys, His and Thr having a geometric configuration similar to that of Leu-11, Lys-35, His-51 and Thr-79 of rat pol. β are Lys-65, Leu-100, His-235 and Thr-386 of HIV reverse transcriptase, which were decided as the candidate for a target pocket amino acid group.

[0131] In the step (4) of the first method of the present invention, it is determined by searching whether or not the candidate for a target pocket amino acid group detected in the step (3) is usable for designing a molecular structure of an inhibitor on the target enzyme.

[0132] One feature of the first method of the present invention resides in this searching method. Specifically, the search is conducted on the basis of whether or not the amino acids constituting the candidate for a target pocket amino acid group determined in the step (3) satisfy the following two requirements:

[0133] (requirement 1) the amino acids are those present at the surface of the target enzyme in the tertiary structure thereof; and

[0134] (requirement 2) the amino acids are those that have been conserved among species, when a comparison is made between the primary sequence of the amino acids of the target enzyme and the primary sequence of the amino acids of another enzyme having the same biochemical activity as the target enzyme but derived from an organism of a different species from that of the target enzyme, in terms of biological classification.

[0135] When the amino acids of the candidate for a target pocket amino acid group determined in the step (3) satisfy the above two requirements, it is determined that the candidate(s) for a target pocket amino acid group is (are) usable for designing a molecular structure of an inhibitor on the target enzyme.
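The screening of the step (4) can be sketched as a simple filter in Python. The sketch assumes that the set of surface residues and the set of conserved residues have already been obtained by other means; the function name `usable_candidates` and the residue labels are hypothetical.

```python
from typing import List, Set, Tuple


def usable_candidates(candidates: List[Tuple[str, ...]],
                      surface: Set[str], conserved: Set[str]) -> List[Tuple[str, ...]]:
    """Keep candidate groups whose residues all satisfy requirements 1 and 2."""
    return [
        group for group in candidates
        # Requirement 1: at the enzyme surface. Requirement 2: conserved among species.
        if all(res in surface and res in conserved for res in group)
    ]


# Hypothetical residue labels (illustration only).
candidates = [("Lys-10", "Leu-20", "Thr-30"), ("Lys-10", "Leu-40", "Thr-50")]
surface = {"Lys-10", "Leu-20", "Thr-30", "Leu-40"}
conserved = {"Lys-10", "Leu-20", "Thr-30", "Thr-50"}
print(usable_candidates(candidates, surface, conserved))
```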

[0136] In requirement 1 of the step (4), it is required that the amino acids are those present at the surface of the target enzyme in the tertiary structure thereof, because it is presumed that such a pocket amino acid group must be present at the surface of the target enzyme in order for a specific substance (an inhibitor) to bind to the pocket amino acid group and inhibit the enzymatic activity of the target enzyme.

[0137] Further, in requirement 2 of the step (4), it is required that the amino acids are those that have been conserved among species, when a comparison is made between the primary sequence of the amino acids of the target enzyme and the primary sequence of the amino acids of another enzyme having the same biochemical activity as the target enzyme but derived from an organism of a different species from that of the target enzyme, in terms of biological classification. The reason thereof is that in a living organism, when amino acids which are essential for maintaining the activity of an enzyme (i.e., amino acids which are directly involved in the activity of the enzyme) are changed, the enzyme decreases its activity and, as a result, the living organism dies. That is, the requirement 2 is on the basis of an idea that the amino acids which play important roles in enzymatic activities will probably be maintained throughout the process of evolution of the organism.

[0138] Herein, the term “biological classification” represents genetic classification on the basis of the evolution theory of living organisms including virus. Accordingly, examples of “enzyme derived from an organism of a different species, in terms of biological classification” include enzymes derived from organisms belonging to different kingdoms, e.g., human (animal kingdom) vs. pea (plant kingdom), human (animal kingdom) vs. yeast (fungus kingdom), or enzymes derived from organisms of different species in the same animal kingdom (such as rat and mouse). In the case of virus, examples of “enzyme derived from an organism of a different species, in terms of biological classification” include enzymes derived from viruses whose hosts are of different species, in terms of biological classification (e.g., HIV vs. monkey immunodeficiency virus) and enzymes derived from viruses whose hosts are of the same species but whose genetic types are different (e.g., HIV-1 vs. HIV-2).

[0139] In the first method of the present invention, it is preferable that the search is carried out with respect to species which are as distant, in terms of both genetics and the evolution theory, as possible.

[0140] The technique which determines by searching whether or not any homology in amino acid sequence exists between a variety of different organisms has been practiced in the fields of molecular biology, information biology and the like. The present invention, however, has applied, for the first time, this technique in the field of molecular design.

[0141] It is preferable that plural types of enzyme are used in requirement 2 of the step (4), which enzymes are derived from an organism of a different species, in terms of biological classification, from the organism from which the target enzyme is derived. It is more preferable that at least ten enzymes from different species are searched. The larger the number of the enzymes, the higher the precision of the search, which is preferable.

[0142] With regard to the judgment on whether or not the amino acid is conserved in different species, the sample which exhibits homology of no less than 80%, in general, and preferably homology of 100%, is regarded as having “conserved amino acid”. It is preferable that as many amino acids as possible, of those constituting the pocket amino acid group, are the amino acids exhibiting 100% homology between the species. However, it is not necessary that all of the amino acids constituting the pocket amino acid group are the amino acids exhibiting 100% homology between the species. It suffices that at least one of the amino acids constituting the pocket amino acid group is the amino acid exhibiting 100% homology between the species.

[0143] Whether or not the amino acid satisfies the two requirements defined in the step (4) can be determined by using computer software which is commercially available (INSIGHT II/BINDING SITE ANALYSIS, manufactured by MSI (Molecular Simulations Inc., U.S.A.)).
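One possible reading of this judgment is sketched below in Python. It rests on assumptions that the text does not state: the sequences are already aligned and of equal length, and the “homology” of a residue is taken as the fraction of the compared species that carry the identical residue at the aligned position. The function names `conservation` and `judge_pocket` and the sequences are hypothetical.

```python
from typing import List


def conservation(target_seq: str, other_seqs: List[str], position: int) -> float:
    """Fraction of other-species sequences sharing the target residue (1-based position)."""
    residue = target_seq[position - 1]
    same = sum(1 for seq in other_seqs if seq[position - 1] == residue)
    return same / len(other_seqs)


def judge_pocket(target_seq: str, other_seqs: List[str],
                 positions: List[int], threshold: float = 0.8) -> bool:
    """Every residue reaches the threshold and at least one is conserved in all species."""
    scores = [conservation(target_seq, other_seqs, p) for p in positions]
    return all(s >= threshold for s in scores) and any(s == 1.0 for s in scores)


# Hypothetical pre-aligned sequences (illustration only).
target_seq = "MKLTH"
other_seqs = ["MKLTH", "MKLSH", "MKLTH", "MKITH", "MKLTH"]
print([conservation(target_seq, other_seqs, p) for p in (2, 3, 4)])
print(judge_pocket(target_seq, other_seqs, [2, 3, 4]))
```

Under this reading, comparing more species makes each fraction more informative, which agrees with the stated preference for searching at least ten enzymes.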

[0144] In the first method of the present invention, a molecular structure of a novel inhibitor on the target enzyme is designed, on the basis of the kinds of amino acid residues of the first enzyme, to which the first inhibitor is to bind, and the three-dimensional configuration of the amino acid residues.

[0145] Specifically, designing of a molecular structure of a novel inhibitor on the target enzyme is carried out by using the method including: specifying the substituent groups of the first inhibitor, which substituent groups are to bind to the amino acids constituting the first pocket amino acid group; adjusting the distances between each of the substituent groups and the three-dimensional configuration thereof such that these substituent groups are configured so that they fit in the geometric configuration formed by the first pocket amino acid group. The types of the substituent groups which are to bind to the amino acid residues constituting the first pocket amino acid group can be specified by using a biochemical method or the like which is known to those skilled in the art.

[0146] As described above, the first method of the present invention enables a molecular structure design of an inhibitor on an enzyme whose tertiary structure is determined and whose enzymatic activity is different from that of the first enzyme. Further, as a modification of the first method of the present invention, it is possible to design a molecular structure of another novel inhibitor on the first enzyme.

[0147] Specifically, the novel inhibitor on the first enzyme may be designed so that it has substituent groups that can bind more firmly to the amino acids in the geometric configuration identified in the first enzyme in the steps (1) and (2) of the first method of the present invention, and so that it accords with the geometric configuration of the first enzyme.

[0148] More specifically, in the binding between the amino acids, Leu-11, Lys-35, His-51 and Thr-79, of rat pol. β and nervonic acid, the carboxyl group at one terminal end of the nervonic acid chain is to bind to the hydrophilic site of Lys-35, and the methyl group at the other terminal end of the nervonic acid chain is to bind to Leu-11, His-51 and Thr-79, as shown in FIG. 14. The bent structure of the nervonic acid chain due to the double bond therein allows the nervonic acid chain to be fitted in the three-dimensional hydrophobic pocket formed by Leu-11, Lys-35, His-51 and Thr-79 of rat pol. β.

[0149] Accordingly, a molecular structure of a novel inhibitor other than nervonic acid, on rat pol. β, can be designed by adjusting the length of a fatty acid chain and the position of double bond so that the thus adjusted fatty acid is fitted in the three-dimensional pocket formed by Leu-11, Lys-35, His-51 and Thr-79 of rat pol. β in a more fitting manner.

[0150] Next, the second method of the present invention will be described in more detail.

[0151] The second method of the present invention is a method of designing a molecular structure of an inhibitor on a target enzyme whose tertiary structure is unknown, comprising: analyzing, in the tertiary structure of a first enzyme whose primary, secondary and tertiary structures are known, a geometric configuration formed by amino acids to which a first inhibitor having biological, inhibitory activity on the first enzyme is to be bound; and searching, in a second enzyme whose primary, secondary and tertiary structures are known, amino acids constructing the similar geometric configuration to the geometric configuration that was analyzed in the first enzyme; thereby designing the molecular structure of the inhibitor on the target enzyme.

[0152] FIG. 4 is a block diagram showing the scheme of each process of the second method of the present invention.

[0153] The second method of the present invention is similar to the first method of the present invention, in that the second method employs basically the same technique utilized in the steps (1) to (4) of the latter.

[0154] That is, the first method of the present invention is to design a molecular structure of an inhibitor on a second enzyme, i.e., the target enzyme whose primary, secondary and tertiary structures have been identified, by utilizing the geometric configuration of a first enzyme. The second method of the present invention employs similar techniques on two types of enzymes, as in the first method. However, the second method is different from the first method in that the former then designs a molecular structure of an inhibitor on a third enzyme, i.e., the target enzyme whose tertiary structure is unknown.

[0155] Herein, “a target enzyme” represents, as described above, an enzyme to which a molecular structure of an inhibitor is to be designed by implementing the methods of the present invention.

[0156] An enzyme generally has a relatively large molecular weight and a relatively complicated three-dimensional structure. Therefore, it is often extremely difficult in terms of technology to determine the tertiary structure, although the primary and secondary structures thereof can somehow be identified. Accordingly, in a group of enzymes that have the same biochemical activity but that exist in different organisms, there may be a case in which the tertiary structures of amino acids constituting the enzymes existing in some kinds of organism are known, while those of the enzymes existing in the rest of the organisms remain unknown. For example, DNA topoisomerase II, topo II, is an enzyme that enables breaking and reconnecting the double strand DNA, thereby eliminating twisting of DNA. Topo II is thus possessed by all the eucaryotes having a capacity of changing topology of DNA. Only the primary sequence of the amino acid of human topo II has been identified, although the amino acid sequence of topo II of yeast (S. cerevisiae) has been identified up to the tertiary structure level (reference 14 and reference 15 listed below). The second method of the present invention can be used for efficiently designing a molecular structure of an inhibitor on such an enzyme whose tertiary structure is unknown.

[0157] As described above, topo II is an enzyme capable of breaking and reconnecting the double strand DNA, thereby eliminating twisting of DNA and changing the topology of DNA. Since such a change in topology is essential in the development of cell division, the inhibitor on human topo II is likely to be applicable as an anticancer drug.

[0158] Next, the three types of enzymes, i.e., a first enzyme, a second enzyme and a target enzyme, employed in the second method of the present invention will be described in detail below.

[0159] The structures of the first and second enzymes have been identified up to the tertiary structural level. In contrast, for the target enzyme, at least the primary sequence of the amino acids has been identified, but the tertiary structure remains unknown. The first enzyme has biochemical activity of a different type from that of the second and target enzymes. The second and target enzymes have biochemical activity of the same type, but they are derived from organisms of biologically different types from each other. The second method of the present invention can be used for designing a molecular structure of an inhibitor on an enzyme, i.e., the target enzyme, which, due to its large molecular weight, does not allow structural analysis by NMR.

[0160] A single type of inhibitor is employed against the enzyme used in the second method of the present invention. The inhibitor is capable of inhibiting the activity of the first enzyme. This inhibitor will be referred to as “the first inhibitor” hereinafter.

[0161] Specific examples of the combination of the first, second and target enzymes, and the first inhibitor, include the combination of rat pol. β, yeast topo II and human topo II, and nervonic acid; and the combination of rat pol. β, yeast topo II, and human topo II, and lithocholic acid.

[0162] In the second method of the present invention, the term “pocket amino acid group” is used for each of the first, second and target enzymes. As described below, plural amino acids in the enzyme that satisfy a specific condition are allocated to each of the pocket amino acid groups.

[0163] In the first enzyme used in the step (1) of the second method of the present invention, the primary, the secondary and the tertiary structures are known. Herein, the expression “the primary, secondary and tertiary structures are known” does not mean that every amino acid is in a state where assignment of its cross peak by NMR has been completed, but rather means that the primary, secondary and tertiary structures of the enzyme have been identified as a whole of the enzyme.

[0164] In the present invention, it is convenient to use an enzyme whose primary, secondary and tertiary structures are already known. However, it is also possible to apply the second method of the present invention after analyzing the primary, secondary and tertiary structures of an enzyme. The primary, secondary and tertiary structures of an enzyme can be analyzed by employing the techniques known in the art, such as biochemical technique, NMR and X-ray crystal structure analysis, in combination.

[0165] One example of the enzyme whose primary, secondary and tertiary structures have been identified is rat pol. β. It is the only enzyme whose primary, secondary and tertiary structures are known, among the DNA polymerase group which represent the enzymes that are essential to living organisms and catalyze the reaction of synthesizing DNA as the main body of genes.

[0166] The first inhibitor on the first enzyme used in the second method of the present invention is preferably an inhibitor whose inhibitory activity (e.g., IC₅₀) is in the order of μM or lower, although the first inhibitor is not particularly restricted thereto. The manner of inhibition may be either competitive or non-competitive. When the enzyme used in the second method of the present invention is rat pol. β, the first inhibitor preferably inhibits, in a competitive manner, the binding of rat pol. β to the DNA strand as the substrate.

[0167] Examples of the inhibitor having biochemical inhibitory activity on rat pol. β include nervonic acid and lithocholic acid.

[0168] In the step (1) of the second method of the present invention, a first pocket amino acid group consisting of plural amino acids to which the first inhibitor is to bind, is determined in the tertiary structure of the first enzyme.

[0169] The plural amino acids constituting the first pocket amino acid group are decided by measuring the ¹H-¹⁵N HMQC spectrum of the first enzyme in the absence of the first inhibitor and the ¹H-¹⁵N HMQC spectrum of the first enzyme in the presence of the first inhibitor; overlaying the cross peaks of the two spectra; and selecting amino acids that satisfy the condition that at least one of the absolute values of the ¹H and ¹⁵N chemical shifts of the amino acid is equal to or more than a specific value. Herein, the specific value can be set in accordance with the type or the like of the enzyme and the inhibitor used in the second method of the present invention. Although the threshold values of the ¹H and ¹⁵N chemical shifts are generally set such that the number of the amino acid residues to be selected is three to four, the number of the amino acid residues is not restricted thereto. In the present invention, the number of amino acids is set in a range of three to four because the minimum number of amino acids required to specify a geometric configuration formed by the amino acid residues is three, and because the number of amino acids to be selected should not be needlessly enlarged, so as to avoid complicating the identification of the geometric configuration.
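
The selection rule described above can be illustrated by the following minimal sketch. It assumes that the “chemical shift” of a residue is the change in its cross-peak position between the spectrum without the inhibitor and the spectrum with the inhibitor. The function name `select_pocket_residues`, the residue labels and all peak positions are hypothetical and serve only as an illustration; they are not measured values.

```python
def select_pocket_residues(free, bound, h_min, n_min):
    """Return residues whose |shift(1H)| >= h_min or |shift(15N)| >= n_min.

    free and bound map a residue label to its cross-peak position
    (1H ppm, 15N ppm) without and with the inhibitor, respectively.
    """
    selected = []
    for residue, (h_free, n_free) in free.items():
        if residue not in bound:
            continue  # peak not assigned in the spectrum with inhibitor
        h_bound, n_bound = bound[residue]
        d_h = abs(h_bound - h_free)  # absolute 1H shift
        d_n = abs(n_bound - n_free)  # absolute 15N shift
        # "at least one of" the two shifts must reach its threshold
        if d_h >= h_min or d_n >= n_min:
            selected.append(residue)
    return selected


# Hypothetical peak positions (ppm), for illustration only.
free = {"Ala-5": (8.20, 121.0), "Gly-9": (8.45, 109.3), "Ser-20": (7.90, 115.6)}
bound = {"Ala-5": (8.21, 121.1), "Gly-9": (8.53, 109.4), "Ser-20": (7.91, 116.2)}

print(select_pocket_residues(free, bound, h_min=0.06, n_min=0.4))
```

Raising `h_min` and `n_min` shrinks the selected group, which is one way the thresholds could be tuned until three to four residues remain.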

[0170] The ¹H-¹⁵N HMQC spectroscopy can be measured by using Varian Unity-Plus 750 spectrometer, manufactured by Varian Co., Ltd. The measurement condition of the ¹H-¹⁵N HMQC spectroscopy may be set according to reference 4 listed below.

[0171] The amount of the inhibitor to be added when the ¹H-¹⁵N HMQC spectroscopy is measured in the presence of the inhibitor may be set by biochemically analyzing, in advance, the amount of the inhibitor which is required for inhibiting the enzymatic activity.

[0172] In the second method of the present invention, when rat pol. β and nervonic acid are used as the combination of the first enzyme and the first inhibitor, respectively, the amount of nervonic acid to be added may be equimolar to the amount of rat pol. β. The absolute values of ¹H and ¹⁵N chemical shift as the indices may be set at 0.06 ppm or larger and 0.4 ppm or larger, respectively. In the amino acid sequence of rat pol. β, the amino acid residues that satisfy such ¹H or ¹⁵N chemical shift are the following four amino acid residues: Leu-11, Lys-35, His-51 and Thr-79.

[0173] Further, in the second method of the present invention, when rat pol. β and lithocholic acid are used as the combination of the first enzyme and the first inhibitor, respectively, the amount of lithocholic acid to be added may be equimolar to the amount of rat pol. β. The absolute values of ¹H and ¹⁵N chemical shift as the indices may be set at 0.06 ppm or larger and 0.6 ppm or larger, respectively. In the amino acid sequence of rat pol. β, the amino acid residues that satisfy such ¹H or ¹⁵N chemical shift are the following three amino acid residues: Lys-60, Leu-77 and Thr-79.

[0174] In the step (2) of the second method of the present invention, the geometric configuration of the first pocket amino acid group selected in the step (1) is determined. Specifically, the determination of the geometric configuration is carried out by obtaining, in the tertiary structure of the first enzyme, the distances between the Cα atoms of each of the amino acids constituting the first pocket amino acid group.

[0175] The distances between the Cα atoms can be calculated by using the structures registered in the Protein Data Bank (Protein Data Bank: 1BNO in the case of rat pol. β, and Protein Data Bank: 1BJT in the case of yeast topo II), whereby the geometric configuration formed by the Cα atoms of the amino acids constituting the first pocket amino acid group can be analyzed. The technique of this analysis is described in detail in reference 4 listed below.
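
As an illustration of the distance calculation, the following minimal sketch computes every pairwise Cα-Cα distance from Cartesian coordinates. Protein Data Bank coordinates are given in ångströms, i.e., in units of 10⁻¹ nm, so the results are in the same unit as the distances used in this specification. The function name `ca_distances`, the residue labels and the coordinates are hypothetical; they are not taken from 1BNO or 1BJT.

```python
import math
from itertools import combinations


def ca_distances(ca_coords):
    """Map each unordered residue pair to its C-alpha distance.

    ca_coords maps a residue label to the (x, y, z) position of its
    C-alpha atom; the result is in the same length unit as the input.
    """
    return {
        (a, b): math.dist(ca_coords[a], ca_coords[b])
        for a, b in combinations(ca_coords, 2)
    }


# Hypothetical C-alpha coordinates (x10^-1 nm), for illustration only.
ca_coords = {
    "res-A": (0.0, 0.0, 0.0),
    "res-B": (3.0, 4.0, 0.0),
    "res-C": (0.0, 0.0, 12.0),
}

for pair, distance in ca_distances(ca_coords).items():
    print(pair, round(distance, 2))
```

A group of n residues yields n(n-1)/2 distances, i.e., three distances for three residues and six for four.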

[0176] In the step (3) of the second method of the present invention, a second pocket amino acid group having a geometric configuration similar to the geometric configuration of the first pocket amino acid group identified in the step (2) is detected in the second enzyme. The second enzyme used in the step (3) satisfies the following requirements:

[0177] (requirement 1) the second enzyme has the same biochemical activity as the target enzyme;

[0178] (requirement 2) the organism from which the second enzyme is derived, belongs to a different type, in terms of biological classification, from the organism from which the target enzyme is derived;

[0179] (requirement 3) the primary, secondary and tertiary structures of the second enzyme are known; and

[0180] (requirement 4) the second enzyme is biochemically inhibited by the first inhibitor.

[0181] The expression that “(an enzyme) has the same biochemical activity as the target enzyme” defined in requirement 1 of the step (3) means that the enzyme catalyzes, in the enzymatic reaction thereof, substantially the same substrate as the target enzyme does, thereby generating substantially the same product in the biochemical reaction.

[0182] The expression that “the organism from which the second enzyme is derived belongs to a different type, in terms of biological classification, from the organism from which the target enzyme is derived” defined in requirement 2 of the step (3) represents the same meaning as described in the first method. Further, as is in the first method, it is preferable that the organism from which the second enzyme is derived, is as distant as possible, in terms of biological classification, from the organism from which the target enzyme is derived.

[0183] With regard to “the primary, secondary and tertiary structure of an enzyme” defined in requirement 3 of the step (3), it is convenient to use an enzyme whose primary, secondary and tertiary structures are already known, as described above. However, it is possible to apply the second method of the present invention after analyzing the primary, secondary and tertiary structures of an enzyme. These structures can be analyzed by combining techniques known in the art, such as biochemical techniques, NMR and X-ray crystallographic analysis.

[0184] The expression that “(the second enzyme) is biochemically inhibited by the first inhibitor” defined in requirement 4 of the step (3) means that the value of the inhibitory activity IC₅₀ of the first inhibitor against the second enzyme is preferably in a range of 1/10 times to 10 times, and more preferably substantially the same as, the value of the inhibitory activity IC₅₀ of the first inhibitor against the first enzyme.
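
The preferred range of requirement 4 can be expressed as a simple ratio test, as in the following minimal sketch. The function name `inhibition_comparable` and the IC₅₀ values in the usage lines are hypothetical and are given only as an illustration.

```python
def inhibition_comparable(ic50_first, ic50_second, factor=10.0):
    """True when ic50_second lies between ic50_first / factor and
    ic50_first * factor (both values in the same concentration unit)."""
    if ic50_first <= 0 or ic50_second <= 0:
        raise ValueError("IC50 values must be positive")
    ratio = ic50_second / ic50_first
    return 1.0 / factor <= ratio <= factor


# Hypothetical IC50 values in micromolar, for illustration only.
print(inhibition_comparable(5.0, 20.0))  # ratio 4
print(inhibition_comparable(5.0, 80.0))  # ratio 16
```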

[0185] In the step (3) of the second method of the present invention, a second pocket amino acid group having a geometric configuration similar to that of the first pocket amino acid group is detected in the second enzyme. The detection is carried out by selecting the plural amino acids of the same kinds as the respective amino acids constituting the first pocket amino acid group determined in the step (2). Also, the distances, in the tertiary structure of the second enzyme, between the Cα atoms of each of the selected plural amino acids meet the following condition. That is, when comparison is made between (a) and (b), wherein (a) is the respective distances between the Cα atoms of each of the amino acids constituting the second pocket amino acid group, and (b) is the respective distances between the Cα atoms in the first pocket amino acid group obtained in the step (2), the differences between (a) and (b), i.e., the corresponding distances between the Cα atoms of each of the amino acids, are within a specific value.

[0186] Herein, “the corresponding distances between the Cα atoms of each of the amino acids” has the same meaning as described in the first method of the present invention.

[0187] The differences in the corresponding distances between the Cα atoms of each of the amino acids may generally be set within a range of ±2×10⁻¹ nm, although the range may be changed in accordance with the type or the like of the first enzyme, the second enzyme and the first inhibitor used. It should be noted, however, that setting a large range permits a large difference between the geometric configurations of the first pocket amino acid group and the second pocket amino acid group. This is not preferable because the precision of the geometric configuration of the inhibitor on the target enzyme will then deteriorate.
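
The detection of the step (3) may be pictured as an exhaustive search, as in the following minimal sketch. This is an assumed brute-force procedure written for illustration; it is not the algorithm of the commercially available software mentioned in this specification. The function name `find_matching_groups`, the residue numbers and the coordinates are hypothetical.

```python
import math
from itertools import combinations, product


def find_matching_groups(reference, candidates, tol=2.0):
    """Search a second enzyme for residue groups resembling a reference group.

    reference:  list of (kind, (x, y, z)) for the first pocket group.
    candidates: list of (kind, number, (x, y, z)) for the second enzyme.
    Returns lists of residue numbers, in reference order, whose kinds equal
    the reference kinds and whose corresponding C-alpha distances each
    differ from the reference distance by at most tol (x10^-1 nm).
    """
    ref_dist = {
        (i, j): math.dist(reference[i][1], reference[j][1])
        for i, j in combinations(range(len(reference)), 2)
    }
    # One pool of same-kind candidates per reference residue.
    pools = [[c for c in candidates if c[0] == kind] for kind, _ in reference]
    matches = []
    for combo in product(*pools):
        numbers = [c[1] for c in combo]
        if len(set(numbers)) != len(numbers):
            continue  # the same residue may not be used twice
        if all(
            abs(math.dist(combo[i][2], combo[j][2]) - d) <= tol
            for (i, j), d in ref_dist.items()
        ):
            matches.append(numbers)
    return matches


# Hypothetical coordinates (x10^-1 nm), for illustration only.
reference = [
    ("LEU", (0.0, 0.0, 0.0)),
    ("HIS", (10.0, 0.0, 0.0)),
    ("THR", (0.0, 12.0, 0.0)),
]
candidates = [
    ("THR", 5, (1.0, 13.0, 0.0)),
    ("LYS", 12, (5.0, 5.0, 5.0)),
    ("HIS", 40, (11.0, 0.0, 0.0)),
    ("LEU", 47, (0.0, 0.0, 0.0)),
    ("LEU", 90, (30.0, 30.0, 30.0)),
]

print(find_matching_groups(reference, candidates))
```

Because only the distances are compared, the matching residues need not appear in the same order in the primary sequence of the two enzymes.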

[0188] When rat pol. β, yeast topo II, human topo II and nervonic acid are used as the first, second, target enzymes and the first inhibitor, respectively, the specific value may be set within a range of ±2×10⁻¹ nm. Similarly, the specific value may be set within the same range when lithocholic acid is used as the first inhibitor.

[0189] In the step (4) of the second method of the present invention, it is determined by searching whether or not the second pocket amino acid group determined in the step (3) is usable for designing a molecular structure of an inhibitor of the target enzyme.

[0190] One of the features of the second method of the present invention resides in the searching method. Specifically, the search is conducted on the basis of whether or not the second pocket amino acid group satisfies the following two requirements:

[0191] (requirement 1) the amino acids are those present at the surface of the target enzyme in the tertiary structure thereof; and

[0192] (requirement 2) the amino acids are those that have been conserved among species, when a comparison is made among the primary sequence of the amino acids of the second enzyme, the primary sequence of the amino acids of the target enzyme and the primary sequence of the amino acids of another enzyme having the same biochemical activity as the second enzyme but derived from an organism of a different species from that of both the second and target enzymes, in terms of biological classification.

[0193] When the amino acids of the second pocket amino acid group satisfy the two requirements, it is determined that the second pocket amino acid group is usable for designing a molecular structure of an inhibitor on the target enzyme.

[0194] The reason why the two requirements are employed in the step (4) is the same as that described in the first method.

[0195] With regard to the number of enzymes derived from different species, in terms of biological classification, used in requirement 2 of the step (4), at least two enzymes (preferably at least ten enzymes) are searched. The larger the number of enzymes, the higher the precision of the search, which is preferable.

[0196] With regard to the judgment on whether or not an amino acid is conserved in different species, an amino acid which exhibits homology of no less than 80%, in general, and preferably homology of 100%, is regarded as a “conserved amino acid”. It is preferable that as many as possible of the amino acids constituting the pocket amino acid group exhibit 100% homology between the species. However, it is not necessary that all of the amino acids constituting the pocket amino acid group exhibit 100% homology between the species. It suffices that one of the amino acids constituting the pocket amino acid group exhibits 100% homology between the species.
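
The conservation judgment can be sketched as follows. The sketch assumes that the “homology” of an amino acid is the fraction of aligned sequences that carry the same residue at the aligned position, with a gap counted as a mismatch; this interpretation, the function names and the short aligned sequences are hypothetical and serve only as an illustration.

```python
def conservation_fraction(aligned_sequences, column, residue):
    """Fraction of aligned sequences carrying `residue` at a 0-based column.

    A gap character ('-') at the column counts as a mismatch.
    """
    hits = sum(1 for seq in aligned_sequences if seq[column] == residue)
    return hits / len(aligned_sequences)


def is_conserved(aligned_sequences, column, residue, threshold=0.8):
    """True when the fraction reaches the threshold (80% by default)."""
    return conservation_fraction(aligned_sequences, column, residue) >= threshold


# Hypothetical aligned sequences of equal length, for illustration only.
aligned = ["MKLH", "MKLH", "MKIH", "MK-H", "MKLH"]

print(conservation_fraction(aligned, 1, "K"), is_conserved(aligned, 1, "K"))
print(conservation_fraction(aligned, 2, "L"), is_conserved(aligned, 2, "L"))
```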

[0197] Whether or not the amino acid residues constituting the second pocket amino acid group determined in the step (3) have a three-dimensional configuration similar to the geometric configuration of the first pocket amino acid group, and whether or not the amino acid residues constituting the second pocket amino acid group satisfy the two requirements defined in the step (4), can be determined by using commercially available computer software (INSIGHT II/BINDING SITE ANALYSIS, manufactured by MSI (Molecular Simulations Inc.)).

[0198] When rat pol. β, yeast topo II, human topo II and nervonic acid are used as the first, second and target enzymes and the first inhibitor, respectively, Thr-596, His-735, Leu-741 and Lys-983 among the amino acid residues constituting the second pocket amino acid group of yeast topo II satisfy each of the requirements in the step (3) and the step (4). On the other hand, when lithocholic acid is used as the first inhibitor, Lys-720, Leu-760 and Thr-791 satisfy these requirements.

[0199] In the second method of the present invention, when the second pocket amino acid group of the second enzyme satisfies the two requirements defined in the step (4), it is presumed that the target pocket amino acid group is present in the target enzyme. Specifically, it is presumed that the respective amino acid residues constituting the target pocket amino acid group are of the same kinds as those constituting the second pocket amino acid group, and that the amino acid residues of the target pocket amino acid group are arranged in a geometric configuration similar to that of the second pocket amino acid group.

[0200] In the second method of the present invention, after the presumption as described above is carried out, a molecular structure of an inhibitor on the target enzyme is designed on the basis of the kinds of amino acid residues of the second enzyme, to which the first inhibitor is to bind, and the geometric configuration of the amino acid residues.

[0201] As a specific designing method, that described in the first method can be employed.

[0202] Next, examples of the present invention will be described hereinafter. It should be noted that the present invention is not restricted to these examples.

EXAMPLE 1

[0203] In this example, rat pol. β, nervonic acid and HIV reverse transcriptase were used as the first enzyme, the first inhibitor and the target enzyme, respectively, for designing a molecular structure of an inhibitor on HIV reverse transcriptase.

[0204] <1-1> The primary sequence of the amino acid residue of rat pol. β is shown in FIG. 2A. The model of the tertiary structure of rat pol. β is shown in FIG. 3.

[0205] <1-2> Measurement of Inhibitory Activity Effected by Nervonic Acid on Rat Pol. β

[0206] The activity of rat pol. β was measured according to the method described in reference 3 listed below, by using poly(dA)/oligo(dT)₁₂₋₁₈ (template DNA) and [³H]-dTTP (nucleotide) as substrates and measuring radiation of [³H]-dTTP incorporated in the presence of nervonic acid of different concentrations.

[0207] FIG. 5 is a graph showing an inhibition curve. In FIG. 5, the vertical axis represents the activity of rat pol. β and the horizontal axis represents the concentration of nervonic acid.

[0208] <1-3> Measurements of ¹H-¹⁵N HMQC Spectroscopy of Rat Pol. β in the Presence and the Absence of Nervonic Acid

[0209] A recombinant E. coli having the 8 kDa domain of rat pol. β was constructed according to the method described in reference 4 listed below. ¹⁵NH₄Cl as the nitrogen source was added to the culture medium of the recombinant E. coli for labeling. In the absence and the presence (1.25 mM, which is equimolar to the 8 kDa domain of rat pol. β) of nervonic acid, ¹H-¹⁵N HMQC spectroscopy (750 MHz) was measured by using Varian Unity-Plus 750 spectrometer, manufactured by Varian Co., Ltd. The experiment was conducted only with the 8 kDa domain of rat pol. β, because it has been confirmed that the amino acid residues to which nervonic acid is to bind are present only at the 8 kDa domain (references 3 and 4 listed below).

[0210] FIG. 6 shows the cross peaks of ¹H-¹⁵N HMQC spectroscopy of rat pol. β. In FIG. 6, the vertical axis represents the shift value of ¹⁵N and the horizontal axis represents the shift value of ¹H. FIG. 6 shows the cross peaks of amino acid residues whose ¹H and ¹⁵N chemical shifts were 0.015 ppm or more and 0.1 ppm or more, respectively, in a manner that the graph of the cross peaks in the presence of nervonic acid and the graph of the cross peaks in the absence of nervonic acid are overlapped. In FIG. 6, among the amino acid residues, those in which the absolute values of ¹H and ¹⁵N chemical shift were 0.06 ppm or more and 0.4 ppm or more, respectively, i.e., four amino acid residues including Leu-11, Lys-35, His-51 and Thr-79, are each indicated by one alphabetic letter representing the type of the amino acid and the sequence number thereof. The group consisting of these four amino acid residues will be referred to as “rat pol. β pocket amino acid group-1” hereinafter.

[0211] FIG. 7 is a bar graph in which the cross peaks of ¹H-¹⁵N HMQC spectroscopy shown in FIG. 6 are expressed as the relationship of ¹H chemical shift value vs. the amino acid residue number. FIG. 8 is a bar graph in which the cross peaks of ¹H-¹⁵N HMQC spectroscopy shown in FIG. 6 are expressed as the relationship of ¹⁵N chemical shift value vs. the amino acid residue number. In FIGS. 7 and 8, the vertical axis represents the chemical shift value and the horizontal axis represents the sequence number of the amino acid.

[0212] <1-4> Measurement of the Distances Between the Cα Atoms of the Rat Pol. β Pocket Amino Acid Group-1

[0213] The distances between the Cα atoms of each of Leu-11, Lys-35, His-51 and Thr-79 constituting the rat pol. β pocket amino acid group-1 obtained in the <1-3> were measured by using Protein Data Bank: 1BNO and with reference to references 14, 15 and 16 listed below. The thus obtained distances between the Cα atoms are shown in Table 1.

TABLE 1
Distance between the Cα atoms of the amino acid residues

Amino acid residues      Distance (×10⁻¹ nm)
Leu-11 and His-51        10.41
Leu-11 and Thr-79        12.79
Leu-11 and Lys-35        27.95
His-51 and Thr-79        15.95
His-51 and Lys-35        21.87
Thr-79 and Lys-35        24.88

[0214] FIG. 9A is a view showing, in a three-dimensional manner, the geometric configuration of the Cα atoms of the amino acid residues distanced from each other as shown in Table 1 (computer software (INSIGHT II/BINDING SITE ANALYSIS, manufactured by Molecular Simulations Inc., U.S.A) was employed). Further, FIG. 10A is a view of a model in which the four amino acid residues constituting the rat pol. β pocket amino acid group-1, to which nervonic acid has bound, are indicated in rat pol. β represented as a ribbon.

[0215] <1-5> Determination, by Searching, of a Target Pocket Amino Acid Group on HIV Reverse Transcriptase

[0216] The target pocket amino acid group on HIV reverse transcriptase was searched according to the following items 1) to 3) using computer software (INSIGHT II/BINDING SITE ANALYSIS, manufactured by Molecular Simulations Inc., U.S.A).

[0217] Item 1) determination, by searching, of whether or not the target pocket amino acid group (HIV pocket amino acid group-1) corresponding to the rat pol. β pocket amino acid group-1 is present in the amino acid sequence of HIV reverse transcriptase;

[0218] Item 2) determination, by searching, of whether or not the amino acids constituting the HIV pocket amino acid group-1 obtained in the above item 1) are present at the surface of HIV reverse transcriptase in the tertiary structure thereof; and

[0219] Item 3) determination, by searching, of whether or not the amino acids constituting the HIV pocket amino acid group-1 are conserved among immunodeficiency virus reverse transcriptases of other types derived from different organisms, in terms of biological classification.

[0220] The distances between each of the Cα atoms obtained above are shown in Table 2.

TABLE 2
Distance between the Cα atoms of the amino acid residues

Amino acid residues      Distance (×10⁻¹ nm)
Leu-100 and His-235      10.94
Leu-100 and Thr-386      16.02
Leu-100 and Lys-65       25.22
His-235 and Thr-386      19.24
His-235 and Lys-65       20.86
Thr-386 and Lys-65       21.39

[0221] In the searching of the above item 1), the amino acids of HIV reverse transcriptase having such distances between each of the Cα atoms as within a range of ±2×10⁻¹ nm with respect to the corresponding distances between each of the Cα atoms in pol. β shown in Table 1, were selected. Further, in the searching of the above items 2) and 3), the 31 sequences of the immunodeficiency virus reverse transcriptase each derived from organisms of different species shown in FIGS. 11A-11F were used.

[0222] FIGS. 11A-11F show the 31 sequences of the immunodeficiency virus reverse transcriptase, which correspond to SEQ ID Nos. 3 to 33, respectively, of the Sequence Listing shown below. Note that only the amino acids up to No. 100 are shown in the Sequence Listing. Each of the amino acid sequences is available from the SWISS-PROT Data Base. In the sequences of FIGS. 11A-11F, one or more hyphens are inserted between specific amino acids so that conservation of the amino acid sequences among the reverse transcriptases can easily be detected (the INSIGHT II/BINDING SITE ANALYSIS was used).

[0223] As shown in FIG. 11B, the Lys (K) as the amino acid No. 65 is conserved throughout the 31 enzymes searched. The other amino acids (e.g., the amino acid No. 100 shown in FIG. 11C) are not always conserved throughout the 31 enzymes. However, even if an amino acid is not completely conserved throughout the searched sequences, an amino acid which satisfies the items 1) and 2) mentioned above can be regarded as an amino acid constituting the target amino acid group. Herein, it is an essential requirement that the types of the amino acids constituting the rat pol. β pocket amino acid group-1 are identical with the types of the amino acids constituting HIV pocket amino acid group-1. Although the other requirements (i.e., the distances between the Cα atoms of each of the amino acids, the amino acids being present at the surface of the enzyme, and the amino acids being conserved among different organisms in terms of biological classification) are not necessarily essential, they are given decreasing priority, in the order mentioned, in determining the target amino acid group, i.e., the requirement of the distances between the Cα atoms is given the highest priority.

[0224] As a result of the determination by searching, it has been confirmed that, in HIV reverse transcriptase, Lys-65, Leu-100, His-235 and Thr-386 constituting HIV pocket amino acid group-1 satisfy the above items 1), 2) and 3).

[0225] FIG. 12B shows, in a three-dimensional manner, the geometric configuration of the Cα atoms of Lys-65, Leu-100, His-235 and Thr-386. FIG. 13B shows a three-dimensional model of HIV reverse transcriptase to which nervonic acid has bound. For comparison, FIG. 12A and FIG. 13A show the geometric configuration and the three-dimensional model, respectively, of rat pol. β to which nervonic acid has bound.

[0226] <1-6> Study of the Binding Manner of Nervonic Acid on Rat Pol. β

[0227] FIGS. 14A and 14B show a three-dimensional molecular model of nervonic acid (NA) and rat pol. β, in which nervonic acid has bound on the 8 kDa domain of rat pol. β (prepared with “Mol Graph”, manufactured by DAIKIN INDUSTRIES, Ltd.). FIG. 14B is a view in which the model of FIG. 14A has been turned by 90° around the vertical axis.

[0228] <1-7>

[0229] The confirmation of the inhibitory activity of nervonic acid on HIV reverse transcriptase was carried out.

[0230] Specifically, the inhibitory activity of nervonic acid on HIV reverse transcriptase was measured according to the method described in reference 8 listed below.

[0231] The result is shown in FIG. 15.

[0232] As is obvious from the inhibition curve shown in FIG. 15, the inhibitory activity of nervonic acid on HIV reverse transcriptase is substantially equal to the inhibitory activity of nervonic acid on rat pol. β.

[0233] <1-8> Designing an Inhibitor on HIV Reverse Transcriptase

[0234] As shown in the molecular models of FIGS. 13A and 13B of the <1-5>, it can be assumed that the binding manner of nervonic acid with rat pol. β is substantially the same as the binding manner of an inhibitor with HIV reverse transcriptase. Thus, it is possible to propose, as one example of the molecular structure of a preferable inhibitor on HIV reverse transcriptase, a cis-fatty acid in which the number of carbon atoms of the carbon chain is in the range of 16 to 24 and one double bond is located on the side of terminal methyl group with respect to the center of the carbon chain.

[0235] Further, designing a molecular structure as described below is also possible.

[0236] A fatty acid like nervonic acid is easily metabolized in vivo and therefore studying the role of rat pol. β is quite difficult in vivo. Thus, by modifying the molecular structure of the inhibitor such that the inhibitor will be non-metabolizable in vivo, while maintaining the molecular structure which presumably effects inhibitory activity on HIV reverse transcriptase, it is possible to provide a compound which maintains its inhibitory activity in the body of a living organism.

EXAMPLE 2

[0237] In this example, rat pol. β, yeast topo II, nervonic acid and human topo II were used as the first enzyme, the second enzyme, the first inhibitor and the third (target) enzyme, respectively, for designing a molecular structure of an inhibitor on human topo II.

[0238] <2-1> The geometric configuration of the binding sites of rat pol. β to which nervonic acid is to bind, was determined in a manner similar to that described in <1-4> of example 1.

[0239] <2-2> The Inhibitory Activity of Nervonic Acid on Yeast (S. cerevisiae) Topoisomerase II

[0240] The inhibitory activity of nervonic acid on yeast topo II was measured according to the method described in references 10 and 11.

[0241] FIG. 16 shows the results of the measurement. FIG. 16 is a view showing the result of the electrophoresis in which lane 1 represents the absence of nervonic acid and other lanes each represent different concentrations of nervonic acid, which are 100 μM, 50 μM and 25 μM.

[0242] In FIG. 16, the nick (Form II) represents the band of the reaction product and the super coil (Form I) represents the band of the unreacted substance.

[0243] <2-3> Determination, by Searching, of the Target Amino Acid Group on Yeast Topo II

[0244] Whether or not amino acids satisfying the following items 1) to 3) exist was searched using computer software (INSIGHT II/BINDING SITE ANALYSIS, manufactured by MSI).

[0245] Item 1) determination, by searching, of whether or not a pocket amino acid group (yeast topo II pocket amino acid group-1) corresponding to the rat pol. β pocket amino acid group-1 is present in the amino acid sequence of yeast topo II;

[0246] Item 2) determination, by searching, of whether or not the amino acids constituting the yeast topo II pocket amino acid group-1 are present at the surface of yeast topo II in the tertiary structure thereof; and

[0247] Item 3) determination, by searching, of whether or not the amino acids constituting the yeast topo II pocket amino acid group-1 are conserved among topo II enzymes from different species, in terms of biological classification, from yeast.

[0248] In the searching of the above item 1), the amino acids of yeast topo II having such distances between each of the Cα atoms as those within a range of ±2×10⁻¹ nm with respect to the corresponding distances between each of the Cα atoms in pol. β shown in Table 1, were selected. Further, in the search of the above item 3), 13 topo II enzymes derived from different species shown in FIG. 17 were used.

[0249] FIG. 17 shows a portion (amino acid Nos. 700 to 789) of the amino acid residues of 13 topo II enzymes, which correspond to SEQ ID Nos. 34 to 46, respectively, of the Sequence Listing shown below.

[0250] FIG. 17 reveals that His (H) of the amino acid No. 735 and Leu (L) of the amino acid No. 741 have been conserved, in evolutionary terms, in 13 topo II enzymes derived from different species.

[0251] As a result of the search, it was confirmed that Thr-596, His-735, Leu-741 and Lys-983 (the topo II pocket amino acid group-1) satisfy the above items 1) to 3).

[0252] FIG. 9B shows, in a three-dimensional manner, the geometric configuration of the Cα atoms of Thr-596, His-735, Leu-741 and Lys-983. Further, FIG. 10B indicates these four amino acid residues constituting the topo II pocket amino acid group-1, in a model in which the amino acid residues of topo II are expressed as ribbons.

[0253] FIG. 2B shows the amino acid sequence of yeast topo II (SEQ ID No. 2 of the Sequence Listing shown below). The amino acid sequence is available from SWISS-PROT Data Base: 1BJT.

[0254] When the sequence of yeast topo II shown in FIG. 2B is compared with the sequence of rat pol. β shown in FIG. 2A, the order in which Leu-11, Lys-35, His-51 and Thr-79 constituting the rat pol. β pocket amino acid group-1 appear in the primary sequence of the amino acid residues is different from the order in which Thr-596, His-735, Leu-741 and Lys-983 constituting the yeast topo II pocket amino acid group-1 appear, except that the rat pol. β pocket amino acid group-1 and the topo II pocket amino acid group-1 have the same relative order of Leu and Lys. Further, the number of the amino acid residues present between each of the four amino acid residues differs between the rat pol. β pocket amino acid group-1 and the topo II pocket amino acid group-1. Thus, it is very difficult to deduce the binding sites where nervonic acid is to bind to the two enzymes from the primary sequences of the amino acid residues thereof.
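
The comparison of the two primary sequences can be reproduced with the following minimal sketch, which uses only the residue numbers stated in this example. The function name `order_and_gaps` is hypothetical.

```python
def order_and_gaps(group):
    """Sort (kind, number) pairs by sequence number and return the order of
    kinds and the number of residues lying between consecutive members."""
    ordered = sorted(group, key=lambda residue: residue[1])
    kinds = [kind for kind, _ in ordered]
    gaps = [b[1] - a[1] - 1 for a, b in zip(ordered, ordered[1:])]
    return kinds, gaps


rat_group_1 = [("Leu", 11), ("Lys", 35), ("His", 51), ("Thr", 79)]
yeast_group_1 = [("Thr", 596), ("His", 735), ("Leu", 741), ("Lys", 983)]

print(order_and_gaps(rat_group_1))
print(order_and_gaps(yeast_group_1))
```

The order of kinds is Leu, Lys, His, Thr in rat pol. β and Thr, His, Leu, Lys in yeast topo II, and the numbers of intervening residues are 23, 15 and 27 versus 138, 5 and 241, respectively.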

EXAMPLE 3

[0255] An experiment was conducted in a manner similar to that of example 2, except that lithocholic acid was used instead of nervonic acid as the first inhibitor.

[0256] <3-1> Measurements of ¹H-¹⁵N HMQC Spectroscopy of Rat Pol. β in the Presence and the Absence of Lithocholic Acid

[0257] FIG. 18 shows the result of the ¹H-¹⁵N HMQC spectroscopy.

[0258] FIG. 19 is a bar graph in which the cross peaks of ¹H-¹⁵N HMQC spectroscopy shown in FIG. 18 are expressed as the relationship of ¹H chemical shift value vs. the amino acid residue number. FIG. 20 is a bar graph in which the cross peaks of ¹H-¹⁵N HMQC spectroscopy shown in FIG. 18 are expressed as the relationship of ¹⁵N chemical shift value vs. the amino acid residue number. In FIGS. 19 and 20, the vertical axis represents the chemical shift value and the horizontal axis represents the sequence number of the amino acid.

[0259] In this example, among the searched amino acid residues, those in which the absolute values of ¹H chemical shift and ¹⁵N chemical shift were 0.06 ppm or more and 0.6 ppm or more, respectively, were three amino acid residues: Lys-60, Leu-77 and Thr-79. The group consisting of these three amino acid residues will be referred to as “rat pol. β pocket amino acid group-2” hereinafter.

[0260] <3-2> Measurements of the Distances Between the Cα Atoms of Each of the Rat Pol. β Pocket Amino Acid Group-2

[0261] Table 3 shows the distances between the Cα atoms of the rat pol. β pocket amino acid group-2.

TABLE 3
Distance between the Cα atoms of the amino acid residues

Amino acid residues      Distance (×10⁻¹ nm)
Lys-60 and Leu-77        14.97
Lys-60 and Thr-79        19.67
Leu-77 and Thr-79         5.70

[0262] FIG. 21A is a view showing, in a three-dimensional manner, the geometric configuration of the Cα atoms of the amino acid residues distanced from each other as shown in Table 3. FIG. 22A is a view showing, in a three-dimensional manner, the amino acid residues of the rat pol. β, in a binding state with lithocholic acid, and the distances between the Cα atoms of these amino acid residues. Further, FIG. 22A is a view indicating the three amino acid residues constituting the rat pol. β pocket amino acid group-2, to which lithocholic acid has bound, in a model in which the amino acid residues of rat pol. β are expressed as a line.

[0263] <3-3> Determination, by Searching, of the Target Amino Acid Group in Yeast Topo II

[0264] Item 1) determination, by searching, of whether or not a pocket amino acid group (yeast topo II pocket amino acid group-2) corresponding to the rat pol. β pocket amino acid group-2 is present in the amino acid sequence of yeast topo II;

[0265] Item 2) determination of whether or not the amino acids constituting the yeast topo II pocket amino acid group-2 are present at the surface of yeast topo II in the tertiary structure thereof; and

[0266] Item 3) determination, by searching, of whether or not the amino acids constituting the yeast topo II pocket amino acid group-2 are the amino acids conserved in other topo II enzymes derived from different species, in terms of biological classification, from yeast.

[0267] In the search of the above item 1), amino acids of yeast topo II were selected such that each distance between their Cα atoms falls within a range of ±2×10⁻¹ nm of the corresponding distance between the Cα atoms in rat pol. β shown in Table 3.
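The search of item 1) can be sketched as an enumeration of residue triples whose Cα distances are compared with Table 3. This is a minimal sketch under stated assumptions, not the procedure actually used: the function name `find_matching_triples`, the requirement that candidate residues have the same amino acid types (Lys, Leu, Thr) as the template, the reading of the stated range as a tolerance of 2×10⁻¹ nm on either side, and all toy coordinates are assumptions made for illustration.

```python
import math
from itertools import product
from typing import List, Sequence, Tuple

# (amino acid type, residue number, Cα coordinates in units of 10^-1 nm)
Residue = Tuple[str, int, Tuple[float, float, float]]

TEMPLATE_TYPES = ("Lys", "Leu", "Thr")
# Table 3 distances in the order Lys-Leu, Lys-Thr, Leu-Thr.
TEMPLATE_DISTANCES = (14.97, 19.67, 5.70)
TOLERANCE = 2.0  # assumed to apply on either side of each template distance


def find_matching_triples(
    residues: Sequence[Residue],
    types: Sequence[str] = TEMPLATE_TYPES,
    template: Sequence[float] = TEMPLATE_DISTANCES,
    tol: float = TOLERANCE,
) -> List[Tuple[int, int, int]]:
    """Return residue-number triples whose Cα distances match the template."""
    # Assumption: only residues of the same type as the template are tried.
    candidates = [[r for r in residues if r[0] == t] for t in types]
    hits = []
    for a, b, c in product(*candidates):
        observed = (
            math.dist(a[2], b[2]),
            math.dist(a[2], c[2]),
            math.dist(b[2], c[2]),
        )
        if all(abs(o - t) <= tol for o, t in zip(observed, template)):
            hits.append((a[1], b[1], c[1]))
    return hits


# Toy coordinates for illustration only (not taken from any real structure).
toy_structure: List[Residue] = [
    ("Lys", 1, (0.0, 0.0, 0.0)),
    ("Leu", 2, (15.0, 0.0, 0.0)),
    ("Thr", 3, (19.5, 3.0, 0.0)),
    ("Leu", 4, (5.0, 0.0, 0.0)),    # too close to Lys 1
    ("Thr", 5, (40.0, 0.0, 0.0)),   # too far from Lys 1
]

print(find_matching_triples(toy_structure))
```

Exhaustive enumeration is adequate for a three-residue template; triples found this way would still need to pass the surface and conservation checks of items 2) and 3).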

[0268] FIG. 23 shows a portion (amino acid Nos. 700 to 780) of the amino acid residues of topo II of 13 types, which correspond to SEQ ID Nos. 34 to 46, respectively, of the Sequence Listing shown below.

[0269] From FIG. 23, it is understood that Lys (K) at amino acid No. 720 and Leu (L) at amino acid No. 760 have been evolutionarily conserved in the 13 topo II enzymes derived from different species.
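A conservation check of this kind can be sketched as a column-by-column comparison of aligned sequences. The sketch below is illustrative only: the function name `conserved_columns` and the toy alignment are hypothetical, the sequences are not those of SEQ ID Nos. 34 to 46, and it assumes the sequences have already been aligned to equal length with "-" marking gaps.

```python
from typing import List, Tuple


def conserved_columns(alignment: List[str]) -> List[Tuple[int, str]]:
    """Return (1-based column, residue) for columns identical in all sequences.

    Assumes pre-aligned one-letter sequences of equal length; columns that
    consist of gaps ("-") are not reported as conserved.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    conserved = []
    for i in range(length):
        column = {seq[i] for seq in alignment}
        if len(column) == 1 and "-" not in column:
            conserved.append((i + 1, alignment[0][i]))
    return conserved


# Toy alignment for illustration only.
toy_alignment = [
    "MKLT-",
    "AKLS-",
    "GKLT-",
]

print(conserved_columns(toy_alignment))
```

Requiring identity in every sequence is the strictest criterion; a looser rule, such as allowing conservative substitutions, would be a different design choice and is not implied by the text.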

[0270] As a result of the search, it was confirmed that Lys-720, Leu-760 and Thr-791 (topo II pocket amino acid group-2) satisfy the above items 1) to 3).

[0271] FIG. 21B shows, in a three-dimensional manner, the geometric configuration of the Cα atoms of Lys-720, Leu-760 and Thr-791. Further, FIG. 22B indicates these three amino acid residues constituting the topo II pocket amino acid group-2, to which lithocholic acid has bound, in a model in which the amino acid residues of topo II are expressed as a line.

[0272] <3-4> Designing a Novel Inhibitor of Human Topo II

[0273] In consideration of the manner in which lithocholic acid binds to rat pol. β, described in <2-6>, it is expected that a steroid skeleton having at least a carboxyl group in the vicinity of the terminal end of its molecular structure will advantageously be used as the structure of an inhibitor of human topo II. Further, it will also be advantageous to add a side chain such as an alkyl group to the molecular structure, so that the inhibitor fits more closely to the triangle formed by Lys-720, Leu-760 and Thr-791.

REFERENCES

[0274] The entire contents disclosed in the following references are incorporated herein by reference.

[0275] (1) Mizushina et al., J. Antibiot. (Tokyo) 49, 491-492 (1996)

[0276] (2) Mizushina et al., Biochim. Biophys. Acta 1308, 256-262 (1996)

[0277] (3) Mizushina et al., Biochim. Biophys. Acta 1336, 509-521 (1997)

[0278] (4) Mizushina et al., J. Biol. Chem. 274, 25599-25607 (1999)

[0279] (5) B. Z. Zmudzka et al., Proc. Natl. Acad. Sci. U.S.A., 83, 5106-5110 (1986)

[0280] (6) H. Pelletier et al., Science, 264, 1891-1903 (1994)

[0281] (7) M. R. Sawaya et al., Science, 264, 1930-1935 (1994)

[0282] (8) Mizushina et al., Biochim. Biophys. Acta, 1336, 509-521 (1997)

[0283] (9) D. Liu et al., Biochemistry, 33, 9537-9545 (1994)

[0284] (10) F. Liu et al., Proc. Natl. Acad. Sci. U.S.A. 78, 3487-3491 (1981)

[0285] (11) A. Ferro et al., J. Biol. Chem. 259, 547-554 (1984)

[0286] (12) D. Liu et al., Biochemistry, 35, 6188-6200 (1996)

[0287] (13) R. Prasad et al., J. Biol. Chem., 273, 11121-11126 (1998)

[0288] (14) Kobayashi et al., Nature Struct. Biol. 4, 677 (1997)

[0289] (15) Wallace et al., Protein Sci. 6, 2308-2323 (1997)

[0290] (16) Russell et al., J. Mol. Biol. 279, 1211-1227 (1998)

1 59 1 335 PRT rat DNA polymerase beta 1 Met Ser Lys Arg Lys Ala Pro Gln Glu Thr Leu Asn Gly Gly Ile Thr 1 5 10 15 Asp Met Leu Val Glu Leu Ala Asn Phe Glu Lys Asn Val Ser Gln Ala 20 25 30 Ile His Lys Tyr Asn Ala Tyr Arg Lys Ala Ala Ser Val Ile Ala Lys 35 40 45 Tyr Pro His Lys Ile Lys Ser Gly Ala Glu Ala Lys Lys Leu Pro Gly 50 55 60 Val Gly Thr Lys Ile Ala Glu Lys Ile Asp Glu Phe Leu Ala Thr Gly 65 70 75 80 Lys Leu Arg Lys Leu Glu Lys Ile Arg Gln Asp Asp Thr Ser Ser Ser 85 90 95 Ile Asn Phe Leu Thr Arg Val Thr Gly Ile Gly Pro Ser Ala Ala Arg 100 105 110 Lys Leu Val Asp Glu Gly Ile Lys Thr Leu Glu Asp Leu Arg Lys Asn 115 120 125 Glu Asp Lys Leu Asn His His Gln Arg Ile Gly Leu Lys Tyr Phe Glu 130 135 140 Asp Phe Glu Lys Arg Ile Pro Arg Glu Glu Met Leu Gln Met Gln Asp 145 150 155 160 Ile Val Leu Asn Glu Val Lys Lys Leu Asp Pro Glu Tyr Ile Ala Thr 165 170 175 Val Cys Gly Ser Phe Arg Arg Gly Ala Glu Ser Ser Gly Asp Met Asp 180 185 190 Val Leu Leu Thr His Pro Asn Phe Thr Ser Glu Ser Ser Lys Gln Pro 195 200 205 Lys Leu Leu His Arg Val Val Glu Gln Leu Gln Lys Val Arg Phe Ile 210 215 220 Thr Asp Thr Arg Ser Lys Gly Glu Thr Lys Phe Met Gly Val Cys Gln 225 230 235 240 Leu Pro Ser Glu Asn Asp Glu Asn Glu Tyr Pro His Arg Arg Ile Asp 245 250 255 Ile Arg Leu Ile Pro Lys Asp Gln Tyr Tyr Cys Gly Val Leu Tyr Phe 260 265 270 Thr Gly Ser Asp Ile Phe Asn Lys Asn Met Arg Ala His Ala Leu Glu 275 280 285 Lys Gly Phe Thr Ile Asn Glu Tyr Thr Ile Arg Pro Leu Gly Val Thr 290 295 300 Gly Val Ala Gly Glu Pro Leu Pro Val Asp Ser Glu Gln Asp Ile Phe 305 310 315 320 Asp Tyr Ile Gln Trp Arg Tyr Arg Glu Pro Lys Asp Arg Ser Glu 325 330 335 2 1429 PRT yeast DNA topoisomerase 2 Met Ser Thr Glu Pro Val Ser Ala Ser Asp Lys Tyr Gln Lys Ile Ser 1 5 10 15 Gln Leu Glu His Ile Leu Lys Arg Pro Asp Thr Tyr Ile Gly Ser Val 20 25 30 Glu Thr Gln Glu Gln Leu Gln Trp Ile Tyr Asp Glu Glu Thr Asp Cys 35 40 45 Met Ile Glu Lys Asn Val Thr Ile Val Pro Gly Leu Phe Lys Ile Phe 50 55 60 Asp Glu Ile Leu Val Asn Ala Ala Asp 
Asn Asn Lys Val Arg Asp Pro 65 70 75 80 Ser Met Lys Arg Ile Asp Val Asn Ile His Ala Glu Glu His Thr Ile 85 90 95 Glu Val Lys Asn Asp Gly Lys Gly Ile Pro Ile Glu Ile His Asn Lys 100 105 110 Glu Asn Ile Tyr Ile Pro Glu Met Ile Phe Gly His Leu Leu Thr Ser 115 120 125 Ser Asn Tyr Asp Asp Asp Glu Lys Lys Val Thr Gly Gly Arg Asn Gly 130 135 140 Tyr Gly Ala Lys Leu Cys Asn Ile Phe Ser Thr Glu Phe Ile Leu Glu 145 150 155 160 Thr Ala Asp Leu Asn Val Gly Gln Lys Tyr Val Gln Lys Trp Glu Asn 165 170 175 Asn Met Ser Ile Cys His Pro Pro Lys Ile Thr Ser Tyr Lys Lys Gly 180 185 190 Pro Ser Tyr Thr Lys Val Thr Phe Lys Pro Asp Leu Thr Arg Phe Gly 195 200 205 Met Lys Glu Leu Asp Asn Asp Ile Leu Gly Val Met Arg Arg Arg Val 210 215 220 Tyr Asp Ile Asn Gly Ser Val Arg Asp Ile Asn Val Tyr Leu Asn Gly 225 230 235 240 Lys Ser Leu Lys Ile Arg Asn Phe Lys Asn Tyr Val Glu Leu Tyr Leu 245 250 255 Lys Ser Leu Glu Lys Lys Arg Gln Leu Asp Asn Gly Glu Asp Gly Ala 260 265 270 Ala Lys Ser Asp Ile Pro Thr Ile Leu Tyr Glu Arg Ile Asn Asn Arg 275 280 285 Trp Glu Val Ala Phe Ala Val Ser Asp Ile Ser Phe Gln Gln Ile Ser 290 295 300 Phe Val Asn Ser Ile Ala Thr Thr Met Gly Gly Thr His Val Asn Tyr 305 310 315 320 Ile Thr Asp Gln Ile Val Lys Lys Ile Ser Glu Ile Leu Lys Lys Lys 325 330 335 Lys Lys Lys Ser Val Lys Ser Phe Gln Ile Lys Asn Asn Met Phe Ile 340 345 350 Phe Ile Asn Cys Leu Ile Glu Asn Pro Ala Phe Thr Ser Gln Thr Lys 355 360 365 Glu Gln Leu Thr Thr Arg Val Lys Asp Phe Gly Ser Arg Cys Glu Ile 370 375 380 Pro Leu Glu Tyr Ile Asn Lys Ile Met Lys Thr Asp Leu Ala Thr Arg 385 390 395 400 Met Phe Glu Ile Ala Asp Ala Asn Glu Glu Asn Ala Leu Lys Lys Ser 405 410 415 Asp Gly Thr Arg Lys Ser Arg Ile Thr Asn Tyr Pro Lys Leu Glu Asp 420 425 430 Ala Asn Lys Ala Gly Thr Lys Glu Gly Tyr Lys Cys Thr Leu Val Leu 435 440 445 Thr Glu Gly Asp Ser Ala Leu Ser Leu Ala Val Ala Gly Leu Ala Val 450 455 460 Val Gly Arg Asp Tyr Tyr Gly Cys Tyr Pro Leu Arg Gly Lys Met Leu 465 470 475 480 Asn Val Arg Glu Ala Ser Ala Asp Gln Ile 
Leu Lys Asn Ala Glu Ile 485 490 495 Gln Ala Ile Lys Lys Ile Met Gly Leu Gln His Arg Lys Lys Tyr Glu 500 505 510 Asp Thr Lys Ser Leu Arg Tyr Gly His Leu Met Ile Met Thr Asp Gln 515 520 525 Asp His Asp Gly Ser His Ile Lys Gly Leu Ile Ile Asn Phe Leu Glu 530 535 540 Ser Ser Phe Leu Gly Leu Leu Asp Ile Gln Gly Phe Leu Leu Glu Phe 545 550 555 560 Ile Thr Pro Ile Ile Lys Val Ser Ile Thr Lys Pro Thr Lys Asn Thr 565 570 575 Ile Ala Phe Tyr Asn Met Pro Asp Tyr Glu Lys Trp Arg Glu Glu Glu 580 585 590 Ser His Lys Phe Thr Trp Lys Gln Lys Tyr Tyr Lys Gly Leu Gly Thr 595 600 605 Ser Leu Ala Gln Glu Val Arg Glu Tyr Phe Ser Asn Leu Asp Arg His 610 615 620 Leu Lys Ile Phe His Ser Leu Gln Gly Asn Asp Lys Asp Tyr Ile Asp 625 630 635 640 Leu Ala Phe Ser Lys Lys Lys Ala Asp Asp Arg Lys Glu Trp Leu Arg 645 650 655 Gln Tyr Glu Pro Gly Thr Val Leu Asp Pro Thr Leu Lys Glu Ile Pro 660 665 670 Ile Ser Asp Phe Ile Asn Lys Glu Leu Ile Leu Phe Ser Leu Ala Asp 675 680 685 Asn Ile Arg Ser Ile Pro Asn Val Leu Asp Gly Phe Lys Pro Gly Gln 690 695 700 Arg Lys Val Leu Tyr Gly Cys Phe Lys Lys Asn Leu Lys Ser Glu Leu 705 710 715 720 Lys Val Ala Gln Leu Ala Pro Tyr Val Ser Glu Cys Thr Ala Tyr His 725 730 735 His Gly Glu Gln Ser Leu Ala Gln Thr Ile Ile Gly Leu Ala Gln Asn 740 745 750 Phe Val Gly Ser Asn Asn Ile Tyr Leu Leu Leu Pro Asn Gly Ala Phe 755 760 765 Gly Thr Arg Ala Thr Gly Gly Lys Asp Ala Ala Ala Ala Arg Tyr Ile 770 775 780 Tyr Thr Glu Leu Asn Lys Leu Thr Arg Lys Ile Phe His Pro Ala Asp 785 790 795 800 Asp Pro Leu Tyr Lys Tyr Ile Gln Glu Asp Glu Lys Thr Val Glu Pro 805 810 815 Glu Trp Tyr Leu Pro Ile Leu Pro Met Ile Leu Val Asn Gly Ala Glu 820 825 830 Gly Ile Gly Thr Gly Arg Ser Thr Tyr Ile Pro Pro Phe Asn Pro Leu 835 840 845 Glu Ile Ile Lys Asn Ile Arg His Leu Met Asn Asp Glu Glu Leu Glu 850 855 860 Gln Met His Pro Trp Phe Arg Gly Trp Thr Gly Thr Ile Glu Glu Ile 865 870 875 880 Glu Pro Leu Arg Tyr Arg Met Tyr Gly Arg Ile Glu Gln Ile Gly Asp 885 890 895 Asn Val Leu Glu Ile Thr Glu Leu Pro Ala Arg 
Thr Trp Thr Ser Thr 900 905 910 Ile Lys Glu Tyr Leu Leu Leu Gly Leu Ser Gly Asn Asp Lys Ile Lys 915 920 925 Pro Trp Ile Lys Asp Met Glu Glu Gln His Asp Asp Asn Ile Lys Phe 930 935 940 Ile Ile Thr Leu Ser Pro Glu Glu Met Ala Lys Thr Arg Lys Ile Gly 945 950 955 960 Phe Tyr Glu Arg Phe Lys Leu Ile Ser Pro Ile Ser Leu Met Asn Met 965 970 975 Val Ala Phe Asp Pro His Gly Lys Ile Lys Lys Tyr Asn Ser Val Asn 980 985 990 Glu Ile Leu Ser Glu Phe Tyr Tyr Val Arg Leu Glu Tyr Tyr Gln Lys 995 1000 1005 Arg Lys Asp His Met Ser Glu Arg Leu Gln Trp Glu Val Glu Lys 1010 1015 1020 Tyr Ser Phe Gln Val Lys Phe Ile Lys Met Ile Ile Glu Lys Glu 1025 1030 1035 Leu Thr Val Thr Asn Lys Pro Arg Asn Ala Ile Ile Gln Glu Leu 1040 1045 1050 Glu Asn Leu Gly Phe Pro Arg Phe Asn Lys Glu Gly Lys Pro Tyr 1055 1060 1065 Tyr Gly Ser Pro Asn Asp Glu Ile Ala Glu Gln Ile Asn Asp Val 1070 1075 1080 Lys Gly Ala Thr Ser Asp Glu Glu Asp Glu Glu Ser Ser His Glu 1085 1090 1095 Asp Thr Glu Asn Val Ile Asn Gly Pro Glu Glu Leu Tyr Gly Thr 1100 1105 1110 Tyr Glu Tyr Leu Leu Gly Met Arg Ile Trp Ser Leu Thr Lys Glu 1115 1120 1125 Arg Tyr Gln Lys Leu Leu Lys Gln Lys Gln Glu Lys Glu Thr Glu 1130 1135 1140 Leu Glu Asn Leu Leu Lys Leu Ser Ala Lys Asp Ile Trp Asn Thr 1145 1150 1155 Asp Leu Lys Ala Phe Glu Val Gly Tyr Gln Glu Phe Leu Gln Arg 1160 1165 1170 Asp Ala Glu Ala Arg Gly Gly Asn Val Pro Asn Lys Gly Ser Lys 1175 1180 1185 Thr Lys Gly Lys Gly Lys Arg Lys Leu Val Asp Asp Glu Asp Tyr 1190 1195 1200 Asp Pro Ser Lys Lys Asn Lys Lys Ser Thr Ala Arg Lys Gly Lys 1205 1210 1215 Lys Ile Lys Leu Glu Asp Lys Asn Phe Glu Arg Ile Leu Leu Glu 1220 1225 1230 Gln Lys Leu Val Thr Lys Ser Lys Ala Pro Thr Lys Ile Lys Lys 1235 1240 1245 Glu Lys Thr Pro Ser Val Ser Glu Thr Lys Thr Glu Glu Glu Glu 1250 1255 1260 Asn Ala Pro Ser Ser Thr Ser Ser Ser Ser Ile Phe Asp Ile Lys 1265 1270 1275 Lys Glu Asp Lys Asp Glu Gly Glu Leu Ser Lys Ile Ser Asn Lys 1280 1285 1290 Phe Lys Lys Ile Ser Thr Ile Phe Asp Lys Met Gly Ser Thr Ser 1295 1300 1305 Ala 
Thr Ser Lys Glu Asn Thr Pro Glu Gln Asp Asp Val Ala Thr 1310 1315 1320 Lys Lys Asn Gln Thr Thr Ala Lys Lys Thr Ala Val Lys Pro Lys 1325 1330 1335 Leu Ala Lys Lys Pro Val Arg Lys Gln Gln Lys Val Val Glu Leu 1340 1345 1350 Ser Gly Glu Ser Asp Leu Glu Ile Leu Asp Ser Tyr Thr Asp Arg 1355 1360 1365 Glu Asp Ser Asn Lys Asp Glu Asp Asp Ala Ile Pro Gln Arg Ser 1370 1375 1380 Arg Arg Gln Arg Ser Ser Arg Ala Ala Ser Val Pro Lys Lys Ser 1385 1390 1395 Tyr Val Glu Thr Leu Glu Leu Ser Asp Asp Ser Phe Ile Glu Asp 1400 1405 1410 Asp Glu Glu Glu Asn Gln Gly Ser Asp Val Ser Phe Asn Glu Glu 1415 1420 1425 Asp 3 430 PRT Human immunodeficiency virus type 1 3 Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys Pro Gly Met 1 5 10 15 Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu Lys Ile Lys 20 25 30 Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly Lys Ile Ser 35 40 45 Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe Ala Ile Lys 50 55 60 Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe Arg Glu Leu 65 70 75 80 Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly Ile Pro His 85 90 95 Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu Asp Val Gly 100 105 110 Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Asp Phe Arg Lys Tyr Thr 115 120 125 Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly Ile Arg Tyr 130 135 140 Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro Ala Ile Phe 145 150 155 160 Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Lys Lys Gln Asn Pro 165 170 175 Asp Ile Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val Gly Ser Asp 180 185 190 Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu Arg Gln His 195 200 205 Leu Leu Arg Trp Gly Leu Thr Thr Pro Asp Lys Lys His Gln Lys Glu 210 215 220 Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp Lys Trp Thr 225 230 235 240 Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr Val Asn Asp 245 250 255 Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln Ile Tyr Pro 260 265 270 Gly Ile Lys Val Arg Gln Leu Ser Lys Leu Leu Arg Gly Thr Lys Ala 
275 280 285 Leu Thr Glu Val Ile Pro Leu Thr Glu Glu Ala Glu Leu Glu Leu Ala 290 295 300 Glu Asn Arg Glu Ile Leu Lys Glu Pro Val His Gly Val Tyr Tyr Asp 305 310 315 320 Pro Ser Lys Asp Leu Ile Ala Glu Ile Gln Lys Gln Gly Gln Gly Gln 325 330 335 Trp Thr Tyr Gln Ile Tyr Gln Glu Pro Phe Lys Asn Leu Lys Thr Gly 340 345 350 Lys Tyr Ala Arg Met Arg Gly Ala His Thr Asn Asp Val Lys Gln Leu 355 360 365 Thr Glu Ala Val Gln Lys Ile Thr Thr Glu Ser Ile Val Ile Trp Gly 370 375 380 Lys Thr Pro Lys Phe Lys Leu Pro Ile Gln Lys Glu Thr Trp Glu Thr 385 390 395 400 Trp Trp Thr Glu Tyr Trp Gln Ala Thr Trp Ile Pro Glu Trp Glu Phe 405 410 415 Val Asn Thr Pro Pro Leu Val Lys Leu Trp Tyr Gln Leu Glu 420 425 430 4 430 PRT Simian immunodeficiency virus 4 Leu Ser Asp Lys Ile Pro Ile Thr Lys Val Lys Leu Lys Pro Gly Val 1 5 10 15 Asp Gly Pro Arg Ile Lys Gln Trp Pro Leu Ser Lys Glu Lys Ile Val 20 25 30 Gly Leu Gln Lys Ile Cys Asp Arg Leu Glu Glu Glu Gly Lys Ile Ser 35 40 45 Arg Val Asp Pro Gly Asn Asn Tyr Asn Thr Pro Ile Phe Ala Ile Lys 50 55 60 Lys Lys Asp Lys Asn Glu Trp Arg Lys Leu Ile Asp Phe Arg Glu Leu 65 70 75 80 Asn Lys Leu Thr Gln Asp Phe His Glu Leu Gln Leu Gly Ile Pro His 85 90 95 Pro Ala Gly Ile Lys Lys Cys Lys Arg Ile Thr Val Leu Asp Ile Gly 100 105 110 Asp Ala Tyr Phe Ser Ile Pro Leu Asp Pro Asp Tyr Arg Pro Tyr Thr 115 120 125 Ala Phe Thr Val Pro Ser Val Asn Asn Gln Ala Pro Gly Lys Arg Tyr 130 135 140 Met Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro Cys Ile Phe 145 150 155 160 Gln Gly Thr Val Ala Ser Leu Leu Glu Val Phe Arg Lys Asn His Pro 165 170 175 Thr Val Gln Leu Tyr Gln Tyr Met Asp Asp Leu Tyr Val Gly Ser Asp 180 185 190 Tyr Thr Ala Glu Glu His Glu Lys Ala Ile Val Glu Leu Arg Ala Leu 195 200 205 Leu Met Thr Trp Asn Leu Glu Thr Pro Glu Lys Lys Tyr Gln Lys Glu 210 215 220 Pro Pro Phe His Trp Met Gly Tyr Glu Leu His Pro Asp Lys Trp Lys 225 230 235 240 Ile Glu Lys Val Gln Leu Pro Glu Leu Ala Glu Gln Pro Thr Val Asn 245 250 255 Glu Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ala 
Gln Leu Tyr 260 265 270 Pro Gly Ile Lys Thr Lys Gln Leu Cys Lys Leu Ile Arg Gly Gly Leu 275 280 285 Asn Ile Thr Glu Lys Val Thr Met Thr Glu Glu Ala Arg Leu Glu Tyr 290 295 300 Glu Gln Asn Lys Glu Ile Leu Ala Glu Glu Gln Glu Gly Ser Tyr Tyr 305 310 315 320 Asp Pro Asn Lys Glu Leu Tyr Val Arg Phe Gln Lys Thr Thr Gly Gly 325 330 335 Asp Ile Ser Phe Gln Trp Lys Gln Gly Asn Lys Val Leu Arg Ala Gly 340 345 350 Lys Tyr Gly Lys Gln Lys Thr Ala His Ser Asn Asp Leu Met Lys Leu 355 360 365 Ala Gly Ala Thr Gln Lys Val Gly Arg Glu Ser Ile Val Ile Trp Gly 370 375 380 Phe Val Pro Lys Met Gln Ile Pro Thr Thr Arg Glu Ile Trp Glu Asp 385 390 395 400 Trp Trp His Glu Tyr Trp Gln Cys Thr Trp Ile Pro Glu Val Glu Phe 405 410 415 Ile Ser Thr Pro Met Leu Glu Arg Glu Trp Tyr Ser Leu Ser 420 425 430 5 428 PRT Human immunodeficiency virus type 2 5 Pro Val Ala Lys Ile Glu Pro Ile Lys Ile Met Leu Lys Pro Gly Lys 1 5 10 15 Asp Gly Pro Arg Leu Arg Gln Trp Pro Leu Thr Lys Glu Lys Ile Glu 20 25 30 Ala Leu Lys Glu Ile Cys Glu Lys Met Glu Lys Glu Gly Gln Leu Glu 35 40 45 Glu Ala Pro Pro Thr Asn Pro Tyr Asn Thr Pro Thr Phe Ala Ile Arg 50 55 60 Lys Lys Asp Lys Asn Lys Trp Arg Met Leu Ile Asp Phe Arg Glu Leu 65 70 75 80 Asn Lys Val Thr Gln Asp Phe Thr Glu Ile Gln Leu Gly Ile Pro His 85 90 95 Pro Ala Gly Leu Ala Lys Lys Arg Arg Ile Thr Val Leu Asp Val Gly 100 105 110 Asp Ala Tyr Phe Ser Ile Pro Leu His Glu Asp Phe Arg Gln Tyr Thr 115 120 125 Ala Phe Thr Leu Pro Ser Val Asn Asn Ala Glu Pro Gly Lys Arg Tyr 130 135 140 Ile Tyr Lys Val Leu Pro Gln Gly Trp Lys Gly Ser Pro Ala Ile Phe 145 150 155 160 Gln Tyr Thr Met Arg Gln Val Leu Glu Pro Phe Arg Lys Ala Asn Ser 165 170 175 Asp Val Ile Ile Ile Gln Tyr Met Asp Asp Ile Leu Ile Ala Ser Asp 180 185 190 Arg Thr Asp Leu Glu His Asp Lys Val Val Leu Gln Leu Lys Glu Leu 195 200 205 Leu Asn Asn Leu Gly Phe Ser Thr Pro Asp Glu Lys Phe Gln Lys Asp 210 215 220 Pro Pro Tyr Arg Trp Met Gly Tyr Glu Leu Trp Pro Thr Lys Trp Lys 225 230 235 240 Leu Gln Lys Ile Gln Leu Pro Gln 
Lys Glu Val Trp Thr Val Asn Asp 245 250 255 Ile Gln Lys Leu Val Gly Val Leu Asn Trp Ala Ala Gln Ile Tyr Pro 260 265 270 Gly Ile Lys Thr Lys His Leu Cys Arg Leu Ile Arg Gly Lys Met Thr 275 280 285 Leu Thr Glu Glu Val Gln Trp Thr Glu Leu Ala Glu Ala Glu Leu Glu 290 295 300 Glu Asn Arg Ile Ile Leu Ser Gln Glu Gln Glu Gly His Tyr Tyr Gln 305 310 315 320 Glu Glu Lys Glu Leu Glu Ala Thr Val Gln Lys Asp Gln Asp Asn Gln 325 330 335 Trp Thr Tyr Lys Ile His Gln Glu Glu Lys Ile Leu Lys Val Gly Lys 340 345 350 Tyr Ala Lys Ile Lys His Thr His Thr Asn Gly Val Lys Leu Leu Ala 355 360 365 Gln Val Val Gln Lys Ile Gly Lys Glu Ala Leu Val Ile Gly Arg Ile 370 375 380 Pro Lys Phe His Leu Pro Val Glu Arg Glu Val Trp Glu Gln Trp Trp 385 390 395 400 Asp Asn Tyr Trp Gln Val Thr Trp Ile Pro Asp Trp Asp Phe Val Ser 405 410 415 Thr Pro Pro Leu Val Arg Leu Ala Phe Asn Leu Val 420 425 6 429 PRT Sooty mangabey 6 Pro Ile Ala Lys Val Glu Pro Ile Lys Val Thr Leu Lys Pro Gly Lys 1 5 10 15 Glu Gly Pro Lys Leu Arg Gln Trp Pro Leu Ser Lys Glu Lys Ile Ile 20 25 30 Ala Leu Arg Glu Ile Cys Glu Lys Met Glu Lys Asp Gly Gln Leu Glu 35 40 45 Glu Ala Pro Pro Thr Asn Pro Tyr Asn Thr Pro Thr Phe Ala Ile Lys 50 55 60 Lys Lys Asp Lys Asn Lys Trp Arg Met Leu Ile Asp Phe Arg Glu Leu 65 70 75 80 Asn Lys Val Thr Gln Asp Phe Thr Glu Val Gln Leu Gly Ile Pro His 85 90 95 Pro Ala Gly Leu Ala Lys Arg Arg Arg Ile Thr Val Leu Asp Val Gly 100 105 110 Asp Ala Tyr Phe Ser Ile Pro Leu Asp Glu Glu Phe Arg Gln Tyr Thr 115 120 125 Ala Phe Thr Leu Pro Ser Val Asn Asn Ala Glu Pro Gly Lys Arg Tyr 130 135 140 Ile Tyr Lys Val Leu Pro Gln Gly Trp Lys Gly Ser Pro Ala Ile Phe 145 150 155 160 Gln Tyr Thr Met Arg Asn Val Leu Glu Pro Phe Arg Lys Ala Asn Pro 165 170 175 Asp Val Thr Leu Ile Gln Tyr Met Asp Asp Ile Leu Ile Ala Ser Asp 180 185 190 Arg Thr Asp Leu Glu His Asp Arg Val Val Leu Gln Leu Lys Glu Leu 195 200 205 Leu Asn Gly Ile Gly Phe Ser Thr Pro Glu Glu Lys Phe Gln Lys Asp 210 215 220 Pro Pro Phe Gln Trp Met Gly Tyr Glu Leu Trp Pro 
Thr Lys Trp Lys 225 230 235 240 Leu Gln Lys Ile Glu Leu Pro Gln Arg Glu Thr Trp Thr Val Asn Asp 245 250 255 Ile Gln Lys Leu Val Gly Val Leu Asn Trp Ala Ala Gln Ile Tyr Pro 260 265 270 Gly Ile Lys Thr Lys His Leu Cys Arg Leu Ile Arg Gly Lys Met Thr 275 280 285 Leu Thr Glu Glu Val Gln Trp Thr Glu Met Ala Glu Ala Glu Tyr Glu 290 295 300 Glu Asn Lys Ile Ile Leu Ser Gln Glu Gln Glu Gly Cys Tyr Tyr Gln 305 310 315 320 Glu Gly Lys Pro Ile Glu Ala Thr Val Ile Lys Ser Gln Asp Asn Gln 325 330 335 Trp Ser Tyr Lys Ile His Gln Glu Asp Lys Val Leu Lys Val Gly Lys 340 345 350 Phe Ala Lys Val Lys Asn Thr His Thr Asn Gly Val Arg Leu Leu Ala 355 360 365 His Val Val Gln Lys Ile Gly Lys Glu Ala Leu Val Ile Trp Gly Glu 370 375 380 Val Pro Lys Phe His Leu Pro Val Glu Arg Glu Ile Trp Glu Gln Trp 385 390 395 400 Trp Thr Asp Tyr Trp Gln Val Thr Trp Ile Pro Asp Trp Asp Phe Val 405 410 415 Ser Thr Pro Pro Leu Val Arg Leu Val Phe Asn Leu Val 420 425 7 429 PRT Equine infectious anemia virus 7 Gln Leu Ser Lys Glu Ile Lys Phe Arg Lys Ile Glu Leu Lys Glu Gly 1 5 10 15 Thr Met Gly Pro Lys Ile Pro Gln Trp Pro Leu Thr Lys Glu Lys Leu 20 25 30 Glu Gly Ala Lys Glu Ile Val Gln Arg Leu Leu Ser Glu Gly Lys Ile 35 40 45 Ser Glu Ala Ser Asp Asn Asn Pro Tyr Asn Ser Pro Ile Phe Val Ile 50 55 60 Lys Lys Arg Ser Gly Lys Trp Arg Leu Leu Gln Asp Leu Arg Glu Leu 65 70 75 80 Asn Lys Thr Val Gln Val Gly Thr Glu Ile Ser Arg Gly Leu Pro His 85 90 95 Pro Gly Gly Leu Ile Lys Cys Lys His Met Thr Val Leu Asp Ile Gly 100 105 110 Asp Ala Tyr Phe Thr Ile Pro Leu Asp Pro Glu Phe Arg Pro Tyr Thr 115 120 125 Ala Phe Thr Ile Pro Ser Ile Asn His Gln Glu Pro Asp Lys Arg Tyr 130 135 140 Val Trp Asn Cys Leu Pro Gln Gly Phe Val Leu Ser Pro Tyr Ile Tyr 145 150 155 160 Gln Lys Thr Leu Gln Glu Ile Leu Gln Pro Phe Arg Glu Arg Tyr Pro 165 170 175 Glu Val Gln Leu Tyr Gln Tyr Met Asp Asp Leu Phe Val Gly Ser Asn 180 185 190 Gly Ser Lys Lys Gln His Lys Glu Leu Ile Ile Glu Leu Arg Ala Ile 195 200 205 Leu Leu Glu Lys Gly Phe Glu Thr Pro Asp 
Asp Lys Leu Gln Glu Val 210 215 220 Pro Pro Tyr Ser Trp Leu Gly Tyr Gln Leu Cys Pro Glu Asn Trp Lys 225 230 235 240 Val Gln Lys Met Gln Leu Asp Met Val Lys Asn Pro Thr Leu Asn Asp 245 250 255 Val Gln Lys Leu Met Gly Asn Ile Thr Trp Met Ser Ser Gly Val Pro 260 265 270 Gly Leu Thr Val Lys His Ile Ala Ala Thr Thr Lys Gly Cys Leu Glu 275 280 285 Leu Asn Gln Lys Val Ile Trp Thr Glu Glu Ala Gln Lys Glu Leu Glu 290 295 300 Glu Asn Asn Glu Lys Ile Lys Asn Ala Gln Gly Leu Gln Tyr Tyr Asn 305 310 315 320 Pro Glu Glu Glu Met Leu Cys Glu Val Glu Ile Thr Lys Asn Tyr Glu 325 330 335 Ala Thr Tyr Val Ile Lys Gln Ser Gln Gly Ile Leu Trp Ala Gly Lys 340 345 350 Lys Ile Met Lys Ala Asn Lys Gly Trp Ser Thr Val Lys Asn Leu Met 355 360 365 Leu Leu Leu Gln His Val Ala Thr Glu Ser Ile Thr Arg Val Gly Lys 370 375 380 Cys Pro Thr Phe Lys Val Pro Phe Thr Ile Glu Gln Val Met Trp Glu 385 390 395 400 Met Gln Lys Gly Trp Tyr Tyr Ser Trp Leu Pro Glu Ile Val Tyr Thr 405 410 415 His Gln Val Val His Asp Asp Trp Arg Met Lys Leu Val 420 425 8 431 PRT Feline immunodeficiency virus 8 Ile Ser Asp Lys Ile Pro Ile Val Lys Val Lys Met Lys Asp Pro Asn 1 5 10 15 Lys Gly Pro Gln Ile Lys Gln Trp Pro Leu Ser Asn Glu Lys Ile Glu 20 25 30 Ala Leu Thr Glu Ile Val Glu Arg Leu Glu Arg Glu Gly Lys Val Lys 35 40 45 Arg Ala Asp Pro Asn Asn Pro Trp Asn Thr Pro Val Phe Ala Ile Lys 50 55 60 Lys Lys Ser Gly Lys Trp Arg Met Leu Ile Asp Phe Arg Glu Leu Asn 65 70 75 80 Lys Leu Thr Glu Lys Gly Ala Glu Val Gln Leu Gly Leu Pro His Pro 85 90 95 Ala Gly Leu Gln Met Lys Lys Gln Ile Thr Val Leu Asp Ile Gly Asp 100 105 110 Ala Tyr Phe Thr Asn Pro Leu Asp Pro Asp Tyr Ala Pro Tyr Thr Ala 115 120 125 Phe Thr Leu Pro Arg Lys Asn Asn Ala Gly Pro Gly Arg Arg Phe Val 130 135 140 Trp Cys Ser Leu Pro Gln Gly Trp Ile Leu Ser Pro Leu Ile Tyr Gln 145 150 155 160 Ser Thr Leu Asp Asn Ile Ile Gln Pro Phe Ile Arg Gln Asn Pro Gln 165 170 175 Leu Asp Ile Tyr Gln Tyr Met Asp Asp Ile Tyr Ile Gly Ser Asn Leu 180 185 190 Ser Lys Lys Glu His Lys Glu Lys Val 
Glu Glu Leu Arg Lys Leu Leu 195 200 205 Leu Trp Trp Gly Phe Glu Thr Pro Glu Asp Lys Leu Gln Glu Glu Pro 210 215 220 Pro Tyr Lys Trp Met Gly Tyr Glu Leu His Pro Leu Thr Trp Thr Ile 225 230 235 240 Gln Gln Lys Gln Leu Glu Thr Pro Glu Lys Pro Thr Leu Asn Glu Leu 245 250 255 Gln Lys Leu Ala Gly Lys Ile Asn Trp Ala Ser Gln Thr Ile Pro Glu 260 265 270 Leu Ser Ile Lys Ser Leu Thr Asn Met Thr Arg Gly Asn Gln Asn Leu 275 280 285 Asn Ser Thr Arg Glu Trp Thr Glu Glu Ala Arg Leu Glu Val Gln Lys 290 295 300 Ala Lys Arg Ala Ile Glu Glu Gln Val Gln Leu Gly Tyr Tyr Asp Pro 305 310 315 320 Ser Lys Glu Leu Tyr Ala Lys Leu Ser Leu Val Gly Pro His Gln Ile 325 330 335 Ser Tyr Gln Val Tyr Gln Lys Cys Pro Glu Lys Ile Leu Trp Tyr Gly 340 345 350 Lys Met Ser Arg Gln Lys Lys Lys Ala Glu Asn Thr Cys Asp Ile Ala 355 360 365 Leu Arg Ala Cys Tyr Lys Ile Arg Glu Glu Ser Ile Ile Arg Ile Gly 370 375 380 Lys Glu Pro Arg Tyr Glu Ile Pro Thr Ser Arg Glu Ala Trp Glu Ser 385 390 395 400 Asn Leu Ile Asn Ser Pro Tyr Leu Lys Ala Pro Pro Pro Glu Val Asp 405 410 415 Tyr Ile His Ala Ala Leu Asn Ile Lys Arg Ala Leu Ser Met Ile 420 425 430 9 424 PRT Visna lentivirus 9 Leu Glu Glu Lys Lys Ile Pro Ile Thr Glu Val Arg Leu Lys Glu Gly 1 5 10 15 Cys Lys Gly Pro His Ile Ala Gln Trp Pro Leu Thr Gln Glu Lys Leu 20 25 30 Glu Gly Leu Lys Glu Ile Val Asp Arg Leu Glu Lys Glu Gly Lys Val 35 40 45 Gly Arg Ala Pro Pro His Trp Thr Cys Asn Thr Pro Ile Phe Cys Ile 50 55 60 Lys Lys Lys Ser Gly Lys Trp Arg Met Leu Ile Asp Phe Arg Glu Leu 65 70 75 80 Asn Lys Gln Thr Glu Asp Leu Ala Glu Ala Gln Leu Gly Leu Pro His 85 90 95 Pro Gly Gly Leu Gln Arg Lys Lys His Val Thr Ile Leu Asp Ile Gly 100 105 110 Asp Ala Tyr Phe Thr Ile Pro Leu Tyr Glu Pro Tyr Arg Gln Tyr Thr 115 120 125 Cys Phe Thr Met Leu Ser Pro Asn Asn Leu Gly Pro Cys Val Arg Tyr 130 135 140 Tyr Trp Lys Val Leu Pro Gln Gly Trp Ile Leu Ser Pro Ser Val Tyr 145 150 155 160 Gln Phe Thr Met Gln Lys Ile Leu Arg Gly Trp Ile Glu Glu His Pro 165 170 175 Met Ile Gln Phe Gly Ile Tyr Met 
Asp Asp Ile Tyr Ile Gly Ser Asp 180 185 190 Leu Gly Leu Glu Glu His Arg Gly Ile Val Asn Glu Leu Ala Ser Tyr 195 200 205 Ile Ala Gln Tyr Gly Phe Met Leu Pro Glu Asp Lys Arg Gln Glu Gly 210 215 220 Tyr Pro Ala Lys Trp Leu Gly Phe Glu Leu His Pro Glu Lys Trp Lys 225 230 235 240 Phe Gln Lys His Thr Leu Pro Glu Ile Thr Glu Gly Pro Ile Thr Leu 245 250 255 Asn Lys Leu Gln Lys Leu Val Gly Asp Leu Val Trp Arg Gln Ser Leu 260 265 270 Ile Gly Lys Ser Ile Pro Asn Ile Leu Lys Leu Met Glu Gly Asp Arg 275 280 285 Ala Leu Gln Ser Glu Arg Tyr Ile Glu Ser Ile His Val Arg Glu Trp 290 295 300 Glu Ala Cys Arg Gln Lys Leu Lys Glu Met Glu Gly Asn Tyr Tyr Asp 305 310 315 320 Glu Glu Lys Asp Ile Tyr Gly Gln Leu Asp Trp Gly Asn Lys Ala Ile 325 330 335 Glu Tyr Ile Val Phe Gln Glu Lys Gly Lys Pro Leu Trp Val Asn Val 340 345 350 Val His Ser Ile Lys Asn Leu Ser Gln Ala Gln Gln Ile Ile Lys Ala 355 360 365 Ala Gln Lys Leu Thr Gln Glu Val Ile Ile Arg Thr Gly Lys Ile Pro 370 375 380 Trp Ile Leu Leu Pro Gly Arg Glu Glu Asp Trp Ile Leu Glu Leu Gln 385 390 395 400 Met Gly Asn Ile Asn Trp Met Pro Ser Phe Trp Ser Cys Tyr Lys Gly 405 410 415 Ser Val Arg Trp Lys Lys Arg Asn 420 10 423 PRT Ovine lentivirus 10 Leu Glu Glu Lys Lys Ile Pro Ile Thr Gln Val Lys Leu Lys Glu Gly 1 5 10 15 Cys Lys Gly Pro His Ile Ala Gln Trp Pro Leu Thr Gln Glu Lys Leu 20 25 30 Glu Gly Leu Lys Glu Ile Val Asp Lys Leu Glu Lys Glu Gly Lys Val 35 40 45 Gly Arg Ala Pro Pro His Trp Thr Cys Asn Thr Pro Ile Phe Cys Ile 50 55 60 Lys Lys Lys Ser Gly Lys Trp Arg Met Leu Ile Asp Phe Arg Glu Leu 65 70 75 80 Asn Lys Gln Thr Glu Asp Leu Ala Glu Ala Gln Leu Gly Leu Pro His 85 90 95 Pro Gly Gly Leu Gln Lys Lys Lys His Val Thr Ile Leu Asp Ile Gly 100 105 110 Asp Ala Tyr Phe Thr Ile Pro Leu Tyr Glu Pro Tyr Arg Pro Tyr Thr 115 120 125 Cys Phe Thr Met Leu Ser Pro Asn Asn Leu Gly Pro Cys Thr Arg Tyr 130 135 140 Tyr Trp Lys Val Leu Pro Gln Gly Trp Lys Leu Ser Pro Ser Val Tyr 145 150 155 160 Gln Phe Thr Met Gln Glu Ile Leu Arg Asp Trp Ile Ala Lys His 
Pro 165 170 175 Met Ile Gln Phe Gly Ile Tyr Met Asp Asp Ile Tyr Ile Gly Ser Asp 180 185 190 Leu Asp Ile Met Lys His Arg Glu Ile Val Glu Glu Leu Ala Ser Tyr 195 200 205 Ile Ala Gln Tyr Gly Phe Met Leu Pro Glu Glu Lys Arg Gln Glu Gly 210 215 220 Tyr Pro Ala Lys Trp Leu Gly Phe Glu Leu His Pro Glu Lys Trp Arg 225 230 235 240 Phe Gln Lys His Thr Leu Pro Glu Ile Lys Glu Gly Thr Ile Thr Leu 245 250 255 Asn Lys Ile Gln Lys Leu Val Gly Asp Leu Val Trp Arg Gln Ser Leu 260 265 270 Ile Gly Lys Ser Ile Pro Asn Ile Leu Lys Leu Met Glu Gly Asp Arg 275 280 285 Ala Leu Gln Ser Glu Arg Arg Ile Glu Leu Arg His Val Lys Glu Trp 290 295 300 Glu Glu Cys Arg Arg Lys Leu Ala Glu Met Glu Gly Asn Tyr Tyr Asp 305 310 315 320 Glu Glu Lys Asp Val Tyr Gly Gln Ile Asp Trp Gly Asp Lys Ile Glu 325 330 335 Tyr Thr Val Phe Gln Glu Arg Gly Lys Pro Leu Trp Val Asn Val Val 340 345 350 His Asn Ile Lys Asn Leu Ser Gln Ser Gln Gln Ile Ile Lys Ala Ala 355 360 365 Gln Lys Leu Thr Gln Glu Val Ile Ile Arg Ile Gly Lys Ile Pro Trp 370 375 380 Ile Leu Leu Pro Gly Lys Glu Glu Asp Trp Ile Leu Glu Leu Gln Ile 385 390 395 400 Gly Asn Ile Thr Trp Met Pro Ser Phe Trp Ser Cys Tyr Arg Gly Ser 405 410 415 Ile Arg Trp Lys Lys Arg Asn 420 11 424 PRT Caprine arthritis encephalitis virus 11 Leu Glu Glu Lys Arg Ile Pro Ile Thr Lys Val Lys Leu Lys Glu Gly 1 5 10 15 Cys Thr Gly Pro His Val Pro Gln Trp Pro Leu Thr Glu Glu Lys Leu 20 25 30 Lys Gly Leu Thr Glu Ile Ile Asp Lys Leu Val Glu Glu Gly Lys Leu 35 40 45 Gly Lys Ala Pro Pro His Trp Thr Cys Asn Thr Pro Ile Phe Cys Ile 50 55 60 Lys Lys Lys Ser Gly Lys Trp Arg Met Leu Ile Asp Phe Arg Glu Leu 65 70 75 80 Asn Lys Gln Thr Glu Asp Leu Thr Glu Ala Gln Leu Gly Leu Pro His 85 90 95 Pro Gly Gly Leu Gln Lys Lys Lys His Val Thr Ile Leu Asp Ile Gly 100 105 110 Asp Ala Tyr Phe Thr Ile Pro Leu Tyr Glu Pro Tyr Arg Glu Tyr Thr 115 120 125 Cys Phe Thr Leu Leu Ser Pro Asn Asn Leu Gly Pro Cys Lys Arg Tyr 130 135 140 Tyr Trp Lys Val Leu Pro Gln Gly Trp Lys Leu Ser Pro Ser Val Tyr 145 150 155 
160 Gln Phe Thr Met Gln Glu Ile Leu Glu Asp Trp Ile Gln Gln His Pro 165 170 175 Glu Ile Gln Phe Gly Ile Tyr Met Asp Asp Ile Tyr Ile Gly Ser Asp 180 185 190 Leu Glu Ile Lys Lys His Arg Glu Ile Val Lys Asp Leu Ala Asn Tyr 195 200 205 Ile Ala Gln Tyr Gly Phe Thr Leu Pro Glu Glu Lys Arg Gln Lys Gly 210 215 220 Tyr Pro Ala Lys Trp Leu Gly Phe Glu Leu His Pro Gln Thr Trp Lys 225 230 235 240 Phe Gln Lys His Thr Leu Pro Glu Leu Thr Lys Gly Thr Ile Thr Leu 245 250 255 Asn Lys Leu Gln Lys Leu Val Gly Glu Leu Val Trp Arg Gln Ser Ile 260 265 270 Ile Gly Lys Ser Ile Pro Asn Ile Leu Lys Ile Leu Glu Gly Asp Arg 275 280 285 Glu Leu Gln Ser Glu Arg Lys Ile Glu Glu Val His Val Lys Glu Trp 290 295 300 Glu Ala Cys Arg Lys Lys Leu Glu Glu Met Glu Gly Asn Tyr Tyr Asn 305 310 315 320 Lys Asp Lys Asp Val Tyr Gly Gln Leu Ala Trp Gly Asp Lys Ala Ile 325 330 335 Glu Tyr Thr Val Tyr Gln Glu Lys Gly Lys Pro Leu Trp Val Asn Val 340 345 350 Val His Asn Ile Lys Asn Leu Ser Ile Pro Gln Gln Val Ile Lys Ala 355 360 365 Ala Gln Lys Leu Thr Gln Glu Val Ile Ile Arg Thr Gly Lys Ile Pro 370 375 380 Trp Ile Leu Leu Pro Gly Lys Glu Glu Asp Trp Arg Leu Glu Leu Gln 385 390 395 400 Leu Gly Asn Ile Thr Trp Met Pro Lys Phe Trp Ser Cys Tyr Arg Gly 405 410 415 His Thr Arg Trp Arg Lys Arg Asn 420 12 418 PRT Bovine immunodeficiency virus 12 Val His Thr Glu Lys Ile Glu Pro Leu Pro Val Lys Val Arg Gly Pro 1 5 10 15 Gly Pro Lys Val Pro Gln Trp Pro Leu Thr Lys Glu Lys Tyr Gln Ala 20 25 30 Leu Lys Glu Ile Val Lys Asp Leu Leu Ala Glu Gly Lys Ile Ser Glu 35 40 45 Ala Ala Trp Asp Asn Pro Tyr Asn Thr Pro Val Phe Val Ile Lys Lys 50 55 60 Lys Gly Thr Gly Arg Trp Arg Met Leu Met Asp Phe Arg Glu Leu Asn 65 70 75 80 Lys Ile Thr Val Lys Gly Gln Glu Phe Ser Thr Gly Leu Pro Tyr Pro 85 90 95 Pro Gly Ile Lys Glu Cys Glu His Leu Thr Ala Ile Asp Ile Lys Asp 100 105 110 Ala Tyr Phe Thr Ile Pro Leu His Glu Asp Phe Arg Pro Phe Thr Ala 115 120 125 Phe Ser Val Val Pro Val Asn Arg Glu Gly Pro Ile Glu Arg Phe Gln 130 135 140 Trp Asn Val Leu 
Pro Gln Gly Trp Val Cys Ser Pro Ala Ile Tyr Gln 145 150 155 160 Thr Thr Thr Gln Lys Ile Ile Glu Asn Ile Lys Lys Ser His Pro Asp 165 170 175 Val Met Leu Tyr Gln Tyr Met Asp Asp Leu Leu Ile Gly Ser Asn Arg 180 185 190 Asp Asp His Lys Gln Ile Val Gln Glu Ile Arg Asp Lys Leu Gly Ser 195 200 205 Tyr Gly Phe Lys Thr Pro Asp Glu Lys Val Gln Glu Glu Arg Val Lys 210 215 220 Trp Ile Gly Phe Glu Leu Thr Pro Lys Lys Trp Arg Phe Gln Pro Arg 225 230 235 240 Gln Leu Lys Ile Lys Asn Pro Leu Thr Val Asn Glu Leu Gln Gln Leu 245 250 255 Val Gly Asn Cys Val Trp Val Gln Pro Glu Val Lys Ile Pro Leu Tyr 260 265 270 Pro Leu Thr Asp Leu Leu Arg Asp Lys Thr Asn Leu Gln Glu Lys Ile 275 280 285 Gln Leu Thr Pro Glu Ala Ile Lys Cys Val Glu Glu Phe Asn Leu Lys 290 295 300 Leu Lys Asp Pro Glu Trp Lys Asp Arg Ile Arg Glu Gly Ala Glu Leu 305 310 315 320 Val Ile Lys Ile Gln Met Val Pro Arg Gly Ile Val Phe Asp Leu Leu 325 330 335 Gln Asp Gly Asn Pro Ile Trp Gly Gly Val Lys Gly Leu Asn Tyr Asp 340 345 350 His Ser Asn Lys Ile Lys Lys Ile Leu Arg Thr Met Asn Glu Leu Asn 355 360 365 Arg Thr Val Val Ile Met Thr Gly Arg Glu Ala Ser Phe Leu Leu Pro 370 375 380 Gly Ser Ser Glu Asp Trp Glu Ala Ala Leu Gln Lys Glu Glu Ser Leu 385 390 395 400 Thr Gln Ile Phe Pro Val Lys Phe Tyr Arg His Ser Cys Arg Trp Thr 405 410 415 Ser Ile 13 440 PRT Mason-Pfizer monkey virus 13 Ile Leu Ala Pro Gln Gln Cys Ala Glu Pro Ile Thr Trp Lys Ser Asp 1 5 10 15 Glu Pro Val Trp Val Asp Gln Trp Pro Leu Thr Asn Asp Lys Leu Ala 20 25 30 Ala Ala Gln Gln Leu Val Gln Glu Gln Leu Glu Ala Gly His Ile Thr 35 40 45 Glu Ser Ser Ser Pro Trp Asn Thr Pro Ile Phe Val Ile Lys Lys Lys 50 55 60 Ser Gly Lys Trp Arg Leu Leu Gln Asp Leu Arg Ala Val Asn Ala Thr 65 70 75 80 Met Val Leu Met Gly Ala Leu Gln Pro Gly Leu Pro Ser Pro Val Ala 85 90 95 Ile Pro Gln Gly Tyr Leu Lys Ile Ile Ile Asp Leu Lys Asp Cys Phe 100 105 110 Phe Ser Ile Pro Leu His Pro Ser Asp Gln Lys Arg Phe Ala Phe Ser 115 120 125 Leu Pro Ser Thr Asn Phe Lys Glu Pro Met Gln Arg Phe Gln Trp Lys 
130 135 140 Val Leu Pro Gln Gly Met Ala Asn Ser Pro Thr Leu Cys Gln Lys Tyr 145 150 155 160 Val Ala Thr Ala Ile His Lys Val Arg His Ala Trp Lys Gln Met Tyr 165 170 175 Ile Ile His Tyr Met Asp Asp Ile Leu Ile Ala Gly Lys Asp Gly Gln 180 185 190 Gln Val Leu Gln Cys Phe Asp Gln Leu Lys Gln Glu Leu Thr Ala Ala 195 200 205 Gly Leu His Ile Ala Pro Glu Lys Val Gln Leu Gln Asp Pro Tyr Thr 210 215 220 Tyr Leu Gly Phe Glu Leu Asn Gly Pro Lys Ile Thr Asn Gln Lys Ala 225 230 235 240 Val Ile Arg Lys Asp Lys Leu Gln Thr Leu Asn Asp Phe Gln Lys Leu 245 250 255 Leu Gly Asp Ile Asn Trp Leu Arg Pro Tyr Leu Lys Leu Thr Thr Gly 260 265 270 Asp Leu Lys Pro Leu Phe Asp Thr Leu Lys Gly Asp Ser Asp Pro Asn 275 280 285 Ser His Arg Ser Leu Ser Lys Glu Ala Leu Ala Ser Leu Glu Lys Val 290 295 300 Glu Thr Ala Ile Ala Glu Gln Phe Val Thr His Ile Asn Tyr Ser Leu 305 310 315 320 Pro Leu Ile Phe Leu Ile Phe Asn Thr Ala Leu Thr Pro Thr Gly Leu 325 330 335 Glu Trp Gln Asp Asn Pro Ile Met Trp Ile His Leu Pro Ala Ser Pro 340 345 350 Lys Lys Val Leu Leu Pro Tyr Tyr Asp Ala Ile Ala Asp Leu Ile Ile 355 360 365 Leu Gly Arg Asp His Ser Lys Ile Tyr Phe Gly Ile Glu Pro Ser Thr 370 375 380 Ile Ile Gln Pro Tyr Ser Lys Ser Gln Ile Asp Trp Leu Met Gln Asn 385 390 395 400 Thr Glu Met Trp Pro Ile Ala Cys Ala Ser Phe Val Gly Ile Leu Asp 405 410 415 Asn His Tyr Pro Pro Asn Lys Leu Ile Gln Phe Cys Lys Leu His Thr 420 425 430 Phe Val Phe Pro Gln Ile Ile Ser 435 440 14 440 PRT Simian retrovirus SRV-1 14 Met Leu Ala Pro Gln Gln Cys Ala Glu Pro Ile Thr Trp Lys Ser Asp 1 5 10 15 Glu Pro Val Trp Val Asp Gln Trp Pro Leu Thr Ser Glu Lys Leu Ala 20 25 30 Ala Ala Gln Gln Leu Val Gln Glu Gln Leu Glu Ala Gly His Ile Thr 35 40 45 Glu Ser Asn Ser Pro Trp Asn Thr Pro Ile Phe Val Ile Lys Lys Lys 50 55 60 Ser Gly Lys Trp Arg Leu Leu Gln Asp Leu Arg Ala Val Asn Ala Thr 65 70 75 80 Met Val Leu Met Gly Ala Leu Gln Pro Gly Leu Pro Ser Pro Val Ala 85 90 95 Ile Pro Gln Gly Tyr Leu Lys Ile Ile Ile Asp Leu Lys Asp Cys Phe 100 105 110 Phe Ser 
Ile Pro Leu His Pro Ser Asp Gln Lys Arg Phe Ala Phe Ser 115 120 125 Leu Pro Ser Thr Asn Phe Lys Glu Pro Met Gln Arg Phe Gln Trp Lys 130 135 140 Val Leu Pro Gln Arg Met Ala Asn Ser Pro Thr Leu Cys Gln Lys Tyr 145 150 155 160 Val Ala Thr Ala Ile His Lys Val Arg His Ala Trp Lys Gln Met Tyr 165 170 175 Ile Ile His Tyr Met Asp Asp Ile Leu Ile Ala Gly Lys Asp Gly Gln 180 185 190 Gln Val Leu Gln Cys Phe Asp Gln Leu Lys Gln Glu Leu Thr Ile Ala 195 200 205 Gly Leu His Ile Ala Pro Glu Lys Ile Gln Leu Gln Asp Pro Tyr Thr 210 215 220 Tyr Leu Gly Phe Glu Leu Asn Gly Pro Lys Ile Thr Asn Gln Lys Ala 225 230 235 240 Val Ile Arg Lys Asp Lys Leu Gln Thr Leu Asn Asp Phe Gln Lys Leu 245 250 255 Leu Gly Asp Ile Asn Trp Leu Arg Pro Tyr Leu Lys Leu Thr Thr Ala 260 265 270 Asp Leu Lys Pro Leu Phe Asp Thr Leu Lys Gly Asp Ser Asn Pro Asn 275 280 285 Ser His Arg Ser Leu Ser Lys Glu Ala Ile Ala Leu Leu Asp Lys Val 290 295 300 Glu Thr Ala Ile Ala Glu Gln Phe Val Thr His Ile Asn Tyr Ser Leu 305 310 315 320 Pro Leu Met Phe Leu Ile Phe Asn Thr Ala Leu Thr Pro Thr Gly Leu 325 330 335 Phe Trp Gln Asn Asn Pro Ile Met Trp Val His Leu Pro Ala Ser Pro 340 345 350 Lys Lys Val Leu Leu Pro Tyr Tyr Asp Ala Ile Ala Asp Leu Ile Ile 355 360 365 Leu Gly Arg Asp His Ser Lys Lys Tyr Phe Gly Ile Glu Pro Ser Val 370 375 380 Ile Ile Pro Pro Tyr Ser Lys Ser Gln Ile Asp Trp Leu Met Gln Asn 385 390 395 400 Thr Glu Met Trp Pro Ile Ala Cys Ala Ser Phe Val Gly Ile Leu Asp 405 410 415 Asn His Tyr Pro Pro Asn Lys Leu Ile Gln Phe Cys Lys Leu His Ala 420 425 430 Phe Ile Phe Pro Gln Ile Ile Ser 435 440 15 440 PRT Simian retrovirus SRV-2 15 Ile Leu Ala Pro Gln Arg Tyr Ala Asp Pro Ile Thr Trp Lys Ser Asp 1 5 10 15 Glu Pro Val Trp Val Asp Gln Trp Pro Leu Thr Gln Glu Lys Leu Ala 20 25 30 Ala Ala Gln Gln Leu Val Gln Glu Gln Leu Gln Ala Gly His Ile Ile 35 40 45 Glu Ser Asn Ser Pro Trp Asn Thr Pro Ile Phe Val Ile Lys Lys Lys 50 55 60 Ser Gly Lys Trp Arg Leu Leu Gln Asp Leu Arg Ala Val Asn Ala Thr 65 70 75 80 Met Val Leu Met Gly Ala 
Leu Gln Pro Gly Leu Pro Ser Pro Val Ala 85 90 95 Ile Pro Gln Gly Tyr Phe Lys Ile Val Ile Asp Leu Lys Asp Cys Phe 100 105 110 Phe Thr Ile Pro Leu Gln Pro Val Asp Gln Lys Arg Phe Ala Phe Ser 115 120 125 Leu Pro Ser Thr Asn Phe Lys Gln Pro Met Lys Arg Tyr Gln Trp Lys 130 135 140 Val Leu Pro Gln Gly Met Ala Asn Ser Pro Thr Leu Cys Gln Lys Tyr 145 150 155 160 Val Ala Ala Ala Ile Glu Pro Val Arg Lys Ser Trp Ala Gln Met Tyr 165 170 175 Ile Ile His Tyr Met Asp Asp Ile Leu Ile Ala Gly Lys Leu Gly Glu 180 185 190 Gln Val Leu Gln Cys Phe Ala Gln Leu Lys Gln Ala Leu Thr Thr Thr 195 200 205 Gly Leu Gln Ile Ala Pro Glu Lys Val Gln Leu Gln Asp Pro Tyr Thr 210 215 220 Tyr Leu Gly Phe Gln Leu Asn Gly Pro Lys Ile Thr Asn Gln Lys Ala 225 230 235 240 Val Ile Arg Arg Asp Lys Leu Gln Thr Leu Asn Asp Phe Gln Lys Leu 245 250 255 Leu Gly Asp Ile Asn Trp Leu Arg Pro Tyr Leu His Leu Thr Thr Gly 260 265 270 Asp Leu Lys Pro Leu Phe Asp Ile Leu Lys Gly Asp Ser Asn Pro Asn 275 280 285 Ser Pro Arg Ser Leu Ser Glu Ala Ala Leu Ala Ser Leu Gln Lys Val 290 295 300 Glu Thr Ala Ile Ala Glu Gln Phe Val Thr Gln Ile Asp Tyr Thr Gln 305 310 315 320 Pro Leu Thr Phe Leu Ile Phe Asn Thr Thr Leu Thr Pro Thr Gly Leu 325 330 335 Phe Trp Gln Asn Asn Pro Val Met Trp Val His Leu Pro Ala Ser Pro 340 345 350 Lys Lys Val Leu Leu Pro Tyr Tyr Asp Ala Ile Ala Asp Leu Ile Ile 355 360 365 Leu Gly Arg Asp Asn Ser Lys Lys Tyr Phe Gly Leu Glu Pro Ser Thr 370 375 380 Ile Ile Pro Pro Tyr Ser Lys Ser Gln Ile His Trp Leu Met Gln Asn 385 390 395 400 Thr Glu Thr Trp Pro Ile Ala Cys Ala Ser Tyr Ala Gly Asn Ile Asp 405 410 415 Asn His Tyr Pro Pro Asn Lys Leu Ile Gln Phe Cys Lys Leu His Ala 420 425 430 Val Val Phe Pro Arg Ile Ile Ser 435 440 16 447 PRT Squirrel monkey retrovirus 16 Ile Pro Val Pro His Ala Asp Lys Ile Ser Trp Lys Ile Thr Asp Pro 1 5 10 15 Val Trp Val Asp Gln Trp Pro Leu Thr Tyr Glu Lys Thr Leu Ala Ala 20 25 30 Ile Ala Leu Val Gln Glu Gln Leu Ala Ala Gly His Ile Glu Pro Thr 35 40 45 Asn Ser Pro Trp Asn Thr Pro Ile Phe 
Ile Ile Lys Lys Lys Ser Gly 50 55 60 Ser Trp Arg Leu Leu Gln Asp Leu Arg Ala Val Asn Lys Val Met Val 65 70 75 80 Pro Met Gly Ala Leu Gln Pro Gly Leu Pro Ser Pro Val Ala Ile Pro 85 90 95 Leu Asn Tyr His Lys Ile Val Ile Asp Leu Lys Asp Cys Phe Phe Thr 100 105 110 Ile Pro Leu His Pro Glu Asp Arg Pro Tyr Phe Ala Phe Ser Val Pro 115 120 125 Gln Ile Asn Phe Gln Ser Pro Met Pro Arg Tyr Gln Trp Lys Val Leu 130 135 140 Pro Gln Gly Met Ala Asn Ser Pro Thr Leu Cys Gln Lys Phe Val Ala 145 150 155 160 Ala Ala Ile Ala Pro Val Arg Ser Gln Trp Pro Glu Ala Tyr Ile Leu 165 170 175 His Tyr Met Asp Asp Ile Leu Leu Ala Cys Asp Ser Ala Glu Ala Ala 180 185 190 Lys Ala Cys Tyr Ala His Ile Ile Ser Cys Leu Thr Ser Tyr Gly Leu 195 200 205 Lys Ile Ala Pro Asp Lys Val Gln Val Ser Glu Pro Phe Ser Tyr Leu 210 215 220 Gly Phe Glu Leu His His Gln Gln Val Phe Thr Pro Arg Val Cys Leu 225 230 235 240 Lys Thr Asp His Leu Lys Thr Leu Asn Asp Phe Gln Lys Leu Leu Gly 245 250 255 Asp Ile Gln Trp Leu Arg Pro Tyr Leu Lys Leu Pro Thr Ser Ala Leu 260 265 270 Val Pro Leu Asn Asn Ile Leu Lys Gly Asp Pro Asn Pro Leu Ser Val 275 280 285 Lys Ala Leu Thr Pro Glu Ala Lys Gln Ser Leu Ala Leu Ile Asn Lys 290 295 300 Ala Ile Gln Asn Gln Ser Val Gln Gln Ile Ser Tyr Asn Leu Pro Leu 305 310 315 320 Val Leu Leu Leu Leu Pro Thr Pro His Thr Pro Thr Ala Val Phe Trp 325 330 335 Gln Pro Asn Gly Thr Asp Pro Thr Lys Asn Gly Ser Pro Leu Leu Trp 340 345 350 Leu His Leu Pro Ala Ser Pro Ser Lys Val Leu Leu Thr Tyr Pro Ser 355 360 365 Leu Leu Ala Met Leu Ile Ile Lys Gly Arg Tyr Thr Gly Arg Gln Leu 370 375 380 Phe Gly Arg Asp Pro His Ser Ile Ile Pro Pro Tyr Thr Gln Asp Gln 385 390 395 400 Leu Thr Trp Leu Leu Gln Thr Ser Asp Glu Trp Ala Ile Ala Leu Ser 405 410 415 Ser Phe Thr Gly Asp Ile Asp Asn His Tyr Pro Ser Asp Pro Val Ile 420 425 430 Gln Phe Ala Lys Leu His Gln Phe Ile Phe Pro Lys Ile Thr Lys 435 440 445 17 441 PRT Ovine pulmonary adenocarcinoma virus 17 Ser Asp Ser Pro Val Thr His Ala Asp Pro Ile Asp Trp Lys Ser Glu 1 5 10 15 Glu 
Pro Val Trp Val Asp Gln Trp Pro Leu Thr Gln Glu Lys Leu Ser 20 25 30 Ala Ala Gln Gln Leu Val Gln Glu Gln Leu Arg Leu Gly His Ile Glu 35 40 45 Pro Ser Thr Ser Ala Trp Asn Ser Pro Ile Phe Val Ile Lys Lys Lys 50 55 60 Ser Gly Lys Trp Arg Leu Leu Gln Asp Leu Arg Lys Val Asn Glu Thr 65 70 75 80 Met Met His Met Gly Ala Leu Gln Pro Gly Leu Pro Thr Pro Ser Ala 85 90 95 Ile Pro Asp Lys Ser Tyr Ile Ile Val Ile Asp Leu Lys Asp Cys Phe 100 105 110 Tyr Thr Ile Pro Leu Ala Pro Gln Asp Cys Lys Arg Phe Ala Phe Ser 115 120 125 Leu Pro Ser Val Asn Phe Lys Glu Pro Met Gln Arg Tyr Gln Trp Arg 130 135 140 Val Leu Pro Gln Gly Met Thr Asn Ser Pro Thr Leu Cys Gln Lys Phe 145 150 155 160 Val Ala Thr Ala Ile Ala Pro Val Arg Gln Arg Phe Pro Gln Leu Tyr 165 170 175 Leu Val His Tyr Met Asp Asp Ile Leu Leu Ala His Thr Asp Glu His 180 185 190 Leu Leu Tyr Gln Ala Phe Ser Ile Leu Lys Gln His Leu Ser Leu Asn 195 200 205 Gly Leu Val Ile Ala Asp Glu Lys Ile Gln Thr His Phe Pro Tyr Asn 210 215 220 Tyr Leu Gly Phe Ser Leu Tyr Pro Arg Val Tyr Asn Thr Gln Leu Val 225 230 235 240 Lys Leu Gln Thr Asp His Leu Lys Thr Leu Asn Asp Phe Gln Lys Leu 245 250 255 Leu Gly Asp Ile Asn Trp Ile Arg Pro Tyr Leu Lys Leu Pro Thr Tyr 260 265 270 Thr Leu Gln Pro Leu Phe Asp Ile Leu Lys Gly Asp Ser Asp Pro Ala 275 280 285 Ser Pro Arg Thr Leu Ser Leu Glu Gly Arg Thr Ala Leu Gln Ser Ile 290 295 300 Glu Glu Ala Ile Arg Gln Gln Gln Ile Thr Tyr Cys Asp Tyr Gln Arg 305 310 315 320 Ser Trp Gly Leu Tyr Ile Leu Pro Thr Pro Arg Ala Pro Thr Gly Val 325 330 335 Leu Tyr Gln Asp Lys Asn Pro Leu Arg Trp Ile Tyr Leu Ser Ala Thr 340 345 350 Pro Thr Lys His Leu Leu Pro Tyr Tyr Glu Leu Val Ala Lys Ile Ile 355 360 365 Ala Lys Gly Arg His Glu Ala Ile Gln Tyr Phe Gly Met Glu Pro Pro 370 375 380 Phe Ile Cys Val Pro Tyr Ala Leu Glu Gln Gln Asp Trp Leu Phe Gln 385 390 395 400 Phe Ser Asp Asn Trp Ser Ile Ala Phe Ala Asn Tyr Pro Gly Gln Ile 405 410 415 Thr His His Tyr Pro Ser Asp Lys Leu Leu Gln Phe Ala Ser Ser His 420 425 430 Ala Phe Ile Phe Pro Lys 
Ile Val Arg 435 440 18 440 PRT Mouse mammary tumor virus 18 Ala Ile Glu Ser Asn Leu Phe Ala Asp Gln Ile Ser Trp Lys Ser Asp 1 5 10 15 Gln Pro Val Trp Leu Asn Gln Trp Pro Leu Lys Gln Glu Lys Leu Gln 20 25 30 Ala Leu Gln Gln Leu Val Thr Glu Gln Leu Gln Leu Gly His Leu Glu 35 40 45 Glu Ser Asn Ser Pro Trp Asn Thr Pro Val Phe Val Ile Lys Lys Lys 50 55 60 Ser Gly Lys Trp Arg Leu Leu Gln Asp Leu Arg Ala Val Asn Ala Thr 65 70 75 80 Met His Asp Met Gly Ala Leu Gln Pro Gly Leu Pro Ser Pro Val Ala 85 90 95 Val Pro Lys Gly Trp Glu Ile Ile Ile Ile Asp Leu Gln Asp Cys Phe 100 105 110 Phe Asn Ile Lys Leu His Pro Glu Asp Cys Lys Arg Phe Ala Phe Ser 115 120 125 Val Pro Ser Pro Asn Phe Lys Arg Pro Tyr Gln Arg Phe Gln Trp Lys 130 135 140 Val Leu Pro Gln Gly Met Lys Asn Ser Pro Thr Leu Cys Gln Lys Phe 145 150 155 160 Val Asp Lys Ala Ile Leu Thr Val Arg Asp Lys Tyr Gln Asp Ser Tyr 165 170 175 Ile Val His Tyr Met Asp Asp Ile Leu Leu Ala His Pro Ser Arg Ser 180 185 190 Ile Val Asp Glu Ile Leu Thr Ser Met Ile Gln Ala Leu Asn Lys His 195 200 205 Gly Leu Val Val Ser Thr Glu Lys Ile Gln Lys Tyr Asp Asn Leu Lys 210 215 220 Tyr Leu Gly Thr His Ile Gln Gly Asp Ser Val Ser Tyr Gln Lys Leu 225 230 235 240 Gln Ile Arg Thr Asp Lys Leu Arg Thr Leu Asn Asp Phe Gln Lys Leu 245 250 255 Leu Gly Asn Ile Asn Trp Ile Arg Pro Phe Leu Lys Leu Thr Thr Gly 260 265 270 Glu Leu Lys Pro Leu Phe Glu Ile Leu Asn Gly Asp Ser Asn Pro Ile 275 280 285 Ser Thr Arg Lys Leu Thr Pro Glu Ala Cys Lys Ala Leu Gln Leu Met 290 295 300 Asn Glu Arg Leu Ser Thr Ala Arg Val Lys Arg Leu Asp Leu Ser Gln 305 310 315 320 Pro Trp Ser Leu Cys Ile Leu Lys Thr Glu Tyr Thr Pro Thr Ala Cys 325 330 335 Leu Trp Gln Asp Gly Val Val Glu Trp Ile His Leu Pro His Ile Ser 340 345 350 Pro Lys Val Ile Thr Pro Tyr Asp Ile Phe Cys Thr Gln Leu Ile Ile 355 360 365 Lys Gly Arg His Arg Ser Lys Glu Leu Phe Ser Lys Asp Pro Asp Tyr 370 375 380 Ile Val Val Pro Tyr Thr Lys Val Gln Phe Asp Leu Leu Leu Gln Glu 385 390 395 400 Lys Glu Asp Trp Pro Ile Ser Leu Leu 
Gly Phe Leu Gly Glu Val His 405 410 415 Phe His Leu Pro Lys Asp Pro Leu Leu Thr Phe Thr Leu Gln Thr Ala 420 425 430 Ile Ile Phe Pro His Met Thr Ser 435 440 19 441 PRT Human endogenous retrovirus 19 Val Thr Val Glu Pro Pro Lys Pro Ile Pro Leu Thr Trp Lys Thr Glu 1 5 10 15 Lys Pro Val Trp Val Asn Gln Trp Pro Leu Pro Lys Gln Lys Leu Glu 20 25 30 Ala Leu His Leu Leu Ala Asn Glu Gln Leu Glu Lys Gly His Ile Glu 35 40 45 Pro Ser Phe Ser Pro Trp Asn Ser Pro Val Phe Val Ile Gln Lys Lys 50 55 60 Ser Gly Lys Trp His Thr Leu Thr Asp Leu Arg Ala Val Asn Ala Val 65 70 75 80 Ile Gln Pro Met Gly Pro Leu Gln Pro Gly Leu Pro Ser Pro Ala Met 85 90 95 Ile Pro Lys Asp Trp Pro Leu Ile Ile Ile Asp Leu Lys Asp Cys Phe 100 105 110 Phe Thr Ile Pro Leu Ala Glu Gln Asp Cys Glu Lys Phe Ala Phe Thr 115 120 125 Ile Pro Ala Ile Asn Asn Lys Glu Pro Ala Thr Arg Phe Gln Trp Lys 130 135 140 Val Leu Pro Gln Gly Met Leu Asn Ser Pro Thr Ile Cys Gln Thr Phe 145 150 155 160 Val Gly Arg Ala Leu Gln Pro Val Arg Glu Lys Phe Ser Asp Cys Tyr 165 170 175 Ile Ile His Tyr Ile Asp Asp Ile Leu Cys Ala Ala Glu Thr Lys Asp 180 185 190 Lys Leu Ile Asp Cys Tyr Thr Phe Leu Gln Ala Glu Val Ala Asn Ala 195 200 205 Gly Leu Ala Ile Ala Ser Asp Lys Ile Gln Thr Ser Thr Pro Phe His 210 215 220 Tyr Leu Gly Met Gln Ile Glu Asn Arg Lys Ile Lys Pro Gln Lys Ile 225 230 235 240 Glu Ile Arg Lys Asp Thr Leu Lys Thr Leu Asn Asp Phe Gln Lys Leu 245 250 255 Leu Gly Asp Ile Asn Trp Ile Arg Pro Thr Leu Gly Ile Pro Thr Tyr 260 265 270 Ala Met Ser Asn Leu Phe Ser Ile Leu Arg Gly Asp Ser Asp Leu Asn 275 280 285 Ser Gln Arg Ile Leu Thr Pro Glu Ala Thr Lys Glu Ile Lys Leu Val 290 295 300 Glu Glu Lys Ile Gln Ser Ala Gln Ile Asn Arg Ile Asp Pro Leu Ala 305 310 315 320 Pro Leu Gln Leu Leu Ile Phe Ala Thr Ala His Ser Pro Thr Gly Ile 325 330 335 Ile Ile Gln Asn Thr Asp Leu Val Glu Trp Ser Phe Leu Pro His Ser 340 345 350 Thr Val Lys Thr Phe Thr Leu Tyr Leu Asp Gln Ile Ala Thr Leu Ile 355 360 365 Gly Gln Thr Arg Leu Arg Ile Thr Lys Leu Cys Gly Asn Asp 
Pro Asp 370 375 380 Lys Ile Val Val Pro Leu Thr Lys Glu Gln Val Arg Gln Ala Phe Ile 385 390 395 400 Asn Ser Gly Ala Trp Gln Ile Gly Leu Ala Asn Phe Val Gly Leu Ile 405 410 415 Asp Asn His Tyr Pro Lys Thr Lys Ile Phe Gln Phe Leu Lys Leu Thr 420 425 430 Thr Trp Ile Leu Pro Lys Ile Thr Arg 435 440 20 440 PRT Mouse intracisternal A-particle 20 Ser Leu Ala Ala Ile Gly Ala Ala Arg Pro Ile Pro Trp Lys Thr Gly 1 5 10 15 Asp Pro Val Trp Val Pro Gln Trp His Leu Ser Ser Glu Lys Leu Glu 20 25 30 Ala Val Ile Gln Leu Val Glu Glu Gln Leu Lys Leu Gly His Ile Asp 35 40 45 Pro Ser Thr Ser Pro Trp Asn Thr Pro Ile Phe Val Ile Lys Lys Lys 50 55 60 Ser Gly Lys Trp Arg Leu Leu His Asp Leu Arg Pro Ile Asn Glu Gln 65 70 75 80 Met Asn Leu Phe Gly Pro Val Gln Arg Gly Leu Pro Val Leu Ser Ala 85 90 95 Leu Pro Arg Gly Trp Asn Leu Ile Ile Ile Asp Ile Lys Asp Cys Phe 100 105 110 Phe Ser Ile Pro Leu Cys Pro Arg Asp Arg Pro Arg Phe Ala Phe Thr 115 120 125 Ile Pro Ser Ile Asn Ser Asp Glu Pro Asp Asn Arg Tyr Gln Trp Lys 130 135 140 Val Leu Pro Gln Gly Met Ser Asn Ser Pro Thr Met Cys Gln Leu Tyr 145 150 155 160 Val Gln Glu Ala Leu Leu Pro Val Arg Glu Gln Phe Pro Ser Leu Ile 165 170 175 Leu Leu Leu Tyr Met Asp Asp Ile Leu Leu Cys His Lys Glu Leu Thr 180 185 190 Met Leu Gln Lys Ala Tyr Pro Phe Leu Leu Lys Thr Leu Ser Gln Trp 195 200 205 Gly Leu Gln Ile Ala Thr Glu Lys Val Gln Ile Ser Asp Thr Gly Gln 210 215 220 Phe Leu Gly Ser Val Val Ser Pro Asp Lys Ile Val Pro Gln Lys Val 225 230 235 240 Glu Ile Arg Arg Asp His Leu His Thr Leu Asn Asn Phe Gln Lys Leu 245 250 255 Leu Gly Asp Ile Asn Trp Leu Arg Pro Phe Leu Lys Ile Pro Ser Ala 260 265 270 Glu Leu Arg Pro Leu Phe Trp Tyr Leu Glu Gly Asp Pro His Ile Ser 275 280 285 Ser Pro Arg Thr Leu Thr Leu Ala Ala Asn Gln Ala Leu Gln Lys Val 290 295 300 Glu Lys Ala Leu Gln Asn Ala Gln Leu Gln Ala Ile Glu Asp Ser Gln 305 310 315 320 Pro Phe Ser Leu Cys Val Phe Lys Thr Ala Gln Leu Pro Thr Ala Val 325 330 335 Leu Trp Gln Asn Gly Pro Leu Leu Trp Ile His Pro Asn Val Ser Pro 
340 345 350 Ala Lys Ile Ile Asp Trp Tyr Pro Asp Ala Ile Ala Gln Leu Ala Leu 355 360 365 Lys Gly Leu Lys Ala Ala Ile Thr His Phe Gly Arg Ser Pro Tyr Leu 370 375 380 Leu Ile Val Pro Tyr Thr Ala Ala Gln Val Gln Thr Leu Ala Ala Thr 385 390 395 400 Ser Asn Asp Trp Ala Val Leu Val Thr Ser Phe Ser Gly Lys Ile Asp 405 410 415 Asn His Tyr Pro Lys His Pro Ile Leu Gln Phe Ala Gln Asn Gln Ser 420 425 430 Val Val Phe Pro Gln Ile Thr Val 435 440 21 404 PRT Hamster intracisternal A-particles 21 Leu Gly Ala Val Glu Ala Ser Arg Pro Ile Pro Trp Lys Thr Glu Glu 1 5 10 15 Pro Leu Trp Val Ser Gln Trp Pro Leu Ser Ser Glu Lys Leu Glu Ala 20 25 30 Val Thr Arg Leu Val Gln Glu Gln Glu Arg Leu Gly His Leu Glu Pro 35 40 45 Ser Thr Ser Pro Trp Asn Thr Pro Ile Phe Val Ile Lys Lys Lys Ser 50 55 60 Gly Lys Trp Arg Leu Leu His Asp Leu Arg Ala Ile Asn Asn Gln Met 65 70 75 80 His Leu Phe Gly Pro Val Gln Arg Gly Leu Pro Leu Leu Ser Ala Leu 85 90 95 Pro Gln Asp Trp Lys Leu Ile Ile Ile Asp Ile Lys Asp Cys Phe Phe 100 105 110 Ser Ile Pro Leu Tyr Pro Arg Asp Arg Pro Arg Phe Ala Phe Thr Ile 115 120 125 Pro Ser Leu Asn His Met Glu Pro Asp Lys Arg Phe Gln Trp Lys Val 130 135 140 Leu Pro Gln Gly Met Ala Asn Ser Pro Thr Ile Cys Gln Leu Tyr Val 145 150 155 160 Gln Glu Ala Leu Glu Pro Ile Arg Lys Gln Phe Thr Ser Leu Ile Val 165 170 175 Ile His Tyr Met Asp Asp Ile Leu Ile Cys His Lys Glu Leu Asp Val 180 185 190 Leu Gln Lys Ala Phe Pro Met Leu Val Ala Glu Leu Lys Gln Trp Gly 195 200 205 Leu Glu Ile Ala Ser Glu Lys Val Gln Ile Ala Asp Thr Gly Leu Phe 210 215 220 Leu Gly Ser Lys Ile Thr Pro Lys Asn Ile Val Pro Gln Lys Ile Glu 225 230 235 240 Ile Arg Lys Asp His Leu Gln Thr Leu Asn Asp Phe Gln Lys Leu Leu 245 250 255 Gly Asp Ile Asn Trp Leu Arg Pro Phe Leu Lys Ile Pro Ser Ala Asp 260 265 270 Leu Lys Pro Leu Phe Asp Leu Leu Glu Gly Glu Pro His Ile Ser Ser 275 280 285 Pro Arg Lys Phe Thr Pro Ala Ala His Arg Ala Leu Gln Met Val Glu 290 295 300 Glu Ala Leu Gln Glu Ala Gln Arg Arg Thr Asn Ser Pro Ala Lys Ile 305 310 315 
320 Ile Asp Trp Tyr Pro Asp Ala Val Ala Gln Pro Arg Ser Arg Ile Lys 325 330 335 Ala Ala Val Thr His Phe Gly Arg Asp Pro Asp Ser Leu Ile Val Pro 340 345 350 Tyr Thr Ala Ala Gln Val Gln Thr Leu Ala Ala Thr Ser Ser Asp Trp 355 360 365 Ala Val Leu Val Thr Ser Phe Ser Gly Gln Ile Asp Asn His Phe Pro 370 375 380 Lys His Pro Ile Leu Gln Phe Ala Leu Asn Gln Ala Ile Val Phe Pro 385 390 395 400 Gln Val Thr Ala 22 438 PRT Rous sarcoma virus 22 Thr Val Ala Leu His Leu Ala Ile Pro Leu Lys Trp Lys Pro Asp His 1 5 10 15 Thr Pro Val Trp Ile Asp Gln Trp Pro Leu Pro Glu Gly Lys Leu Val 20 25 30 Ala Leu Thr Gln Leu Val Glu Lys Glu Leu Gln Leu Gly His Ile Glu 35 40 45 Pro Ser Leu Ser Cys Trp Asn Thr Pro Val Phe Val Ile Arg Lys Ala 50 55 60 Ser Gly Ser Tyr Arg Leu Leu His Asp Leu Arg Ala Val Asn Ala Lys 65 70 75 80 Leu Val Pro Phe Gly Ala Val Gln Gln Gly Ala Pro Val Leu Ser Ala 85 90 95 Leu Pro Arg Gly Trp Pro Leu Met Val Leu Asp Leu Lys Asp Cys Phe 100 105 110 Phe Ser Ile Pro Ile Ala Glu Gln Asp Arg Glu Ala Phe Ala Phe Thr 115 120 125 Leu Pro Ser Val Asn Asn Gln Ala Pro Ala Arg Arg Phe Gln Trp Lys 130 135 140 Val Leu Pro Gln Gly Met Thr Cys Ser Pro Thr Ile Cys Gln Leu Val 145 150 155 160 Val Gly Gln Val Leu Glu Pro Leu Arg Leu Lys His Pro Ser Leu Cys 165 170 175 Met Leu His Tyr Met Asp Asp Leu Leu Leu Ala Ala Ser Ser His Asp 180 185 190 Gly Leu Glu Ala Ala Gly Glu Glu Val Ile Ser Thr Leu Glu Arg Ala 195 200 205 Gly Phe Thr Ile Ser Pro Asp Lys Val Gln Arg Glu Pro Gly Val Gln 210 215 220 Tyr Leu Gly Tyr Lys Leu Gly Ser Thr Tyr Val Ala Pro Val Gly Leu 225 230 235 240 Val Ala Glu Pro Arg Ile Ala Thr Leu Trp Asp Val Gln Lys Leu Val 245 250 255 Gly Ser Leu Gln Trp Leu Arg Pro Ala Leu Gly Ile Pro Pro Arg Leu 260 265 270 Met Gly Pro Phe Tyr Glu Gln Leu Arg Gly Ser Asp Pro Asn Glu Ala 275 280 285 Arg Glu Trp Asn Leu Asp Met Lys Met Ala Trp Arg Glu Ile Val Arg 290 295 300 Leu Ser Thr Thr Ala Ala Leu Glu Arg Trp Asp Pro Ala Leu Pro Leu 305 310 315 320 Glu Gly Ala Val Ala Arg Cys Glu Gln Gly Ala 
Ile Gly Val Leu Gly 325 330 335 Gln Gly Leu Ser Thr His Pro Arg Pro Cys Leu Trp Leu Phe Ser Thr 340 345 350 Gln Pro Thr Lys Ala Phe Thr Ala Trp Leu Glu Val Leu Thr Leu Leu 355 360 365 Ile Thr Lys Leu Arg Ala Ser Ala Val Arg Thr Phe Gly Lys Glu Val 370 375 380 Asp Ile Leu Leu Leu Pro Ala Cys Phe Arg Glu Asp Leu Pro Leu Pro 385 390 395 400 Glu Gly Ile Leu Leu Ala Leu Lys Gly Phe Ala Gly Lys Ile Arg Ser 405 410 415 Ser Asp Thr Pro Ser Ile Phe Asp Ile Ala Arg Pro Leu His Val Ser 420 425 430 Leu Lys Val Arg Val Thr 435 23 453 PRT Human T-cell lymphotropic virus type 1 23 Asn Leu Ala Asn Thr Gly Ala Ser Arg Pro Trp Ala Arg Thr Pro Pro 1 5 10 15 Lys Ala Pro Arg Asn Gln Pro Val Pro Phe Lys Pro Glu Arg Leu Gln 20 25 30 Ala Leu Gln His Leu Val Arg Lys Ala Leu Glu Ala Gly His Ile Glu 35 40 45 Pro Tyr Thr Gly Pro Gly Asn Asn Pro Val Phe Pro Val Lys Lys Ala 50 55 60 Asn Gly Thr Trp Arg Phe Ile His Asp Leu Arg Ala Thr Asn Ser Leu 65 70 75 80 Thr Ile Asp Leu Ser Ser Ser Ser Pro Gly Pro Pro Asp Leu Ser Ser 85 90 95 Leu Pro Thr Thr Leu Ala His Leu Gln Thr Ile Asp Leu Arg Asp Ala 100 105 110 Phe Phe Gln Ile Pro Leu Pro Lys Gln Phe Gln Pro Tyr Phe Ala Phe 115 120 125 Thr Val Pro Gln Gln Cys Asn Tyr Gly Pro Gly Thr Arg Tyr Ala Trp 130 135 140 Lys Val Leu Pro Gln Gly Phe Lys Asn Ser Pro Thr Leu Phe Glu Met 145 150 155 160 Gln Leu Ala His Ile Leu Gln Pro Ile Arg Gln Ala Phe Pro Gln Cys 165 170 175 Thr Ile Leu Gln Tyr Met Asp Asp Ile Leu Leu Ala Ser Pro Ser His 180 185 190 Glu Asp Leu Leu Leu Leu Ser Glu Ala Thr Met Ala Ser Leu Ile Ser 195 200 205 His Gly Leu Pro Val Ser Glu Asn Lys Thr Gln Gln Thr Pro Gly Thr 210 215 220 Ile Lys Phe Leu Gly Gln Ile Ile Ser Pro Asn His Leu Thr Tyr Asp 225 230 235 240 Ala Val Pro Thr Val Pro Ile Arg Ser Arg Trp Ala Leu Pro Glu Leu 245 250 255 Gln Ala Leu Leu Gly Glu Ile Gln Trp Val Ser Lys Gly Thr Pro Thr 260 265 270 Leu Arg Gln Pro Leu His Ser Leu Tyr Cys Ala Leu Gln Arg His Thr 275 280 285 Asp Pro Arg Asp Gln Ile Tyr Leu Asn Pro Ser Gln Val Gln Ser Leu 
290 295 300 Val Gln Leu Arg Gln Ala Leu Ser Gln Asn Cys Arg Ser Arg Leu Val 305 310 315 320 Gln Thr Leu Pro Leu Leu Gly Ala Ile Met Leu Thr Leu Thr Gly Thr 325 330 335 Thr Thr Val Val Phe Gln Ser Lys Glu Gln Trp Pro Leu Val Trp Leu 340 345 350 His Ala Pro Leu Pro His Thr Ser Gln Cys Pro Trp Gly Gln Leu Leu 355 360 365 Ala Ser Ala Val Leu Leu Leu Asp Lys Tyr Thr Leu Gln Ser Tyr Gly 370 375 380 Leu Leu Cys Gln Thr Ile His His Asn Ile Ser Thr Gln Thr Phe Asn 385 390 395 400 Gln Phe Ile Gln Thr Ser Asp His Pro Ser Val Pro Ile Leu Leu His 405 410 415 His Ser His Arg Phe Lys Asn Leu Gly Ala Gln Thr Gly Glu Leu Trp 420 425 430 Asn Thr Phe Leu Lys Thr Ala Ala Pro Leu Ala Pro Val Lys Ala Leu 435 440 445 Met Pro Val Phe Thr 450 24 528 PRT Human T-cell lymphotropic virus type 2 24 Gly Lys Ala Pro Arg His Pro Asp Pro Arg Arg Gln Trp Ala Asn Gln 1 5 10 15 His Pro Val Gln Thr Pro Pro Asn Pro Pro Thr His Ile Leu Ala Leu 20 25 30 Pro Lys Val Pro Arg Tyr Pro Phe Leu Leu Pro Leu Arg His Pro Gln 35 40 45 Gln Met Asp His His Trp Lys Gly Arg Pro Thr Thr Met Pro Gly Ala 50 55 60 Ser Ile Pro Pro Arg Arg Pro Gln Pro Pro Pro Ile Ala Ala Asn Ser 65 70 75 80 His Ser Lys His His Arg Pro Arg Thr Pro Ser Pro Thr Ser Pro Ser 85 90 95 Gly Pro Ile Ser Phe Lys Pro Glu Arg Leu Gln Ala Leu Asn Asp Leu 100 105 110 Val Ser Lys Ala Leu Glu Ala Gly His Ile Glu Pro Tyr Ser Gly Pro 115 120 125 Gly Asn Asn Pro Val Phe Pro Val Lys Lys Pro Asn Gly Lys Trp Arg 130 135 140 Phe Ile His Asp Leu Arg Ala Thr Asn Ala Ile Thr Thr Thr Leu Thr 145 150 155 160 Ser Pro Ser Pro Gly Pro Pro Asp Leu Thr Ser Leu Pro Thr Ala Leu 165 170 175 Pro His Leu Gln Thr Ile Asp Leu Thr Asp Ala Phe Phe Gln Ile Pro 180 185 190 Leu Pro Lys Gln Tyr Gln Pro Tyr Phe Ala Phe Thr Ile Pro Gln Pro 195 200 205 Cys Asn Tyr Gly Pro Gly Thr Arg Tyr Ala Trp Thr Val Leu Pro Gln 210 215 220 Gly Phe Lys Asn Ser Pro Thr Leu Phe Glu Gln Gln Leu Ala Ala Val 225 230 235 240 Leu Asn Pro Met Arg Lys Met Phe Pro Thr Ser Thr Ile Val Gln Tyr 245 250 255 Met Asp 
Asp Ile Leu Leu Ala Ser Pro Thr Asn Glu Glu Leu Gln Gln 260 265 270 Leu Ser Gln Leu Thr Leu Gln Ala Leu Thr Thr His Gly Leu Pro Ile 275 280 285 Ser Gln Glu Lys Thr Gln Gln Thr Pro Gly Gln Ile Arg Phe Leu Gly 290 295 300 Gln Val Ile Ser Pro Asn His Ile Thr Tyr Glu Ser Thr Pro Thr Ile 305 310 315 320 Pro Ile Lys Ser Gln Trp Thr Leu Thr Glu Leu Gln Val Ile Leu Gly 325 330 335 Glu Ile Gln Trp Val Ser Lys Gly Thr Pro Ile Leu Arg Lys His Leu 340 345 350 Gln Ser Leu Tyr Ser Ala Leu His Gly Tyr Arg Asp Pro Arg Ala Cys 355 360 365 Ile Thr Leu Thr Pro Gln Gln Leu His Ala Leu His Ala Ile Gln Gln 370 375 380 Ala Leu Gln His Asn Cys Arg Gly Arg Leu Asn Pro Ala Leu Pro Leu 385 390 395 400 Leu Gly Leu Ile Ser Leu Ser Thr Ser Gly Thr Thr Ser Val Ile Phe 405 410 415 Gln Pro Lys Gln Asn Trp Pro Leu Ala Trp Leu His Thr Pro His Pro 420 425 430 Pro Thr Ser Leu Cys Pro Trp Gly His Leu Leu Ala Cys Thr Ile Leu 435 440 445 Thr Leu Asp Lys Tyr Thr Leu Gln His Tyr Gly Gln Leu Cys Gln Ser 450 455 460 Phe His His Asn Met Ser Lys Gln Ala Leu Cys Asp Phe Leu Arg Asn 465 470 475 480 Ser Pro His Pro Ser Val Gly Ile Leu Ile His His Met Gly Arg Phe 485 490 495 His Asn Leu Gly Ser Gln Pro Ser Gly Pro Trp Lys Thr Leu Leu His 500 505 510 Leu Pro Thr Leu Leu Gln Glu Pro Arg Leu Leu Arg Pro Ile Phe Thr 515 520 525 25 473 PRT Feline endogenous virus 25 Gly Arg Ala Lys Cys Gln Val Pro Ile Ile Ile Asp Leu Lys Pro Thr 1 5 10 15 Ala Met Pro Val Ser Ile Arg Gln Tyr Pro Met Ser Lys Glu Ala His 20 25 30 Met Gly Ile Gln Pro His Ile Thr Arg Phe Leu Glu Leu Gly Val Leu 35 40 45 Arg Pro Cys Arg Ser Pro Trp Asn Thr Pro Leu Leu Pro Val Lys Lys 50 55 60 Pro Gly Thr Arg Asp Tyr Arg Pro Val Gln Asp Leu Arg Glu Val Asn 65 70 75 80 Lys Arg Thr Met Asp Ile His Pro Thr Val Pro Asn Pro Tyr Asn Leu 85 90 95 Leu Ser Thr Leu Ser Pro Asp Arg Thr Trp Tyr Thr Val Leu Asp Leu 100 105 110 Lys Asp Ala Phe Phe Cys Leu Pro Leu Ala Pro Gln Ser Gln Glu Leu 115 120 125 Phe Ala Phe Glu Trp Arg Asp Pro Glu Arg Gly Ile Ser Gly Gln Leu 130 135 
140 Thr Trp Thr Arg Leu Pro Gln Gly Phe Lys Asn Ser Pro Thr Leu Phe 145 150 155 160 Asp Glu Ala Leu His Arg Asp Leu Thr Asp Phe Arg Thr Gln His Pro 165 170 175 Glu Val Thr Leu Leu Gln Tyr Val Asp Asp Leu Leu Leu Ala Ala Pro 180 185 190 Thr Lys Glu Ala Cys Ile Arg Gly Thr Lys His Leu Leu Arg Glu Leu 195 200 205 Gly Asp Lys Gly Tyr Arg Ala Ser Ala Lys Lys Ala Gln Ile Cys Gln 210 215 220 Thr Lys Val Thr Tyr Leu Gly Tyr Ile Leu Ser Glu Gly Lys Arg Trp 225 230 235 240 Leu Thr Pro Gly Arg Ile Glu Thr Val Ala His Ile Pro Pro Pro Gln 245 250 255 Asn Pro Arg Glu Val Arg Glu Phe Leu Gly Thr Ala Gly Phe Cys Arg 260 265 270 Leu Trp Ile Pro Gly Phe Ala Glu Leu Ala Ala Pro Leu Tyr Ala Leu 275 280 285 Thr Lys Glu Ser Ala Pro Phe Thr Trp Gln Glu Lys His Gln Ser Ala 290 295 300 Phe Glu Ala Leu Lys Glu Ala Leu Leu Ser Ala Pro Ala Leu Gly Leu 305 310 315 320 Pro Asp Thr Ser Lys Pro Phe Thr Leu Phe Ile Asp Glu Lys Gln Gly 325 330 335 Ile Ala Lys Gly Val Leu Thr Gln Lys Leu Gly Pro Trp Lys Arg Pro 340 345 350 Val Ala Tyr Leu Ser Lys Lys Leu Asp Pro Val Ala Ala Gly Trp Pro 355 360 365 Pro Cys Leu Arg Ile Met Ala Ala Thr Ala Met Leu Val Lys Asp Ser 370 375 380 Ala Lys Leu Thr Leu Gly Gln Pro Leu Thr Val Ile Thr Pro His Ala 385 390 395 400 Leu Ala Ala Ile Val Arg Gln Thr Pro Asp Arg Trp Ile Thr Asn Ala 405 410 415 Ala Arg Leu Thr His Tyr Gln Ala Leu Leu Leu Asp Thr Asp Arg Ile 420 425 430 Gln Phe Gly Pro Pro Val Thr Leu Asn Pro Ala Thr Leu Leu Pro Ala 435 440 445 Pro Glu Asp Gln Gln Ser Ala His Asp Cys Arg Gln Val Leu Ala Glu 450 455 460 Thr His Gly Thr Arg Glu Asp Leu Lys 465 470 26 472 PRT Baboon endogenous virus 26 Gly Arg Ala Lys Cys Gln Ala Pro Ile Ile Ile Asp Leu Lys Pro Thr 1 5 10 15 Ala Val Pro Val Ser Ile Lys Gln Tyr Pro Met Ser Leu Glu Ala His 20 25 30 Met Gly Ile Arg Gln His Ile Ile Lys Phe Leu Glu Leu Gly Val Leu 35 40 45 Arg Pro Cys Arg Ser Pro Trp Asn Thr Pro Leu Leu Pro Val Lys Lys 50 55 60 Pro Gly Thr Gln Asp Tyr Arg Pro Val Gln Asp Leu Arg Glu Ile Asn 65 70 75 80 Lys Arg 
Thr Val Asp Ile His Pro Thr Val Pro Asn Pro Tyr Asn Leu 85 90 95 Leu Ser Thr Leu Lys Pro Asp Tyr Ser Trp Tyr Thr Val Leu Asp Leu 100 105 110 Lys Asp Ala Phe Phe Cys Leu Pro Leu Ala Pro Gln Ser Gln Glu Leu 115 120 125 Phe Ala Phe Glu Trp Lys Asp Pro Glu Arg Gly Ile Ser Gly Gln Leu 130 135 140 Thr Trp Thr Arg Leu Pro Gln Gly Phe Lys Asn Ser Pro Thr Leu Phe 145 150 155 160 Asp Glu Ala Leu His Arg Asp Leu Thr Asp Phe Arg Thr Gln His Pro 165 170 175 Glu Val Thr Leu Leu Gln Tyr Val Asp Asp Leu Leu Leu Ala Ala Pro 180 185 190 Thr Lys Lys Ala Cys Thr Gln Gly Thr Arg His Leu Leu Gln Glu Leu 195 200 205 Gly Glu Lys Gly Tyr Arg Ala Ser Ala Lys Lys Ala Gln Ile Cys Gln 210 215 220 Thr Lys Val Thr Tyr Leu Gly Tyr Ile Leu Ser Glu Gly Lys Arg Trp 225 230 235 240 Leu Thr Pro Gly Arg Ile Glu Thr Val Ala Arg Ile Pro Pro Pro Arg 245 250 255 Asn Pro Arg Glu Val Arg Glu Phe Leu Gly Thr Ala Gly Phe Cys Arg 260 265 270 Leu Trp Ile Pro Gly Phe Ala Glu Leu Ala Ala Pro Leu Tyr Ala Leu 275 280 285 Thr Lys Glu Ser Thr Pro Phe Thr Trp Gln Thr Glu His Gln Leu Ala 290 295 300 Phe Glu Ala Leu Lys Lys Ala Leu Leu Ser Ala Pro Ala Leu Gly Leu 305 310 315 320 Pro Asp Thr Ser Lys Pro Phe Thr Leu Phe Leu Asp Glu Arg Gln Gly 325 330 335 Ile Ala Lys Gly Val Leu Thr Gln Lys Leu Gly Pro Trp Lys Arg Pro 340 345 350 Val Ala Tyr Leu Ser Lys Lys Leu Asp Pro Val Ala Ala Gly Trp Pro 355 360 365 Pro Cys Leu Arg Ile Met Ala Ala Thr Ala Met Leu Val Lys Asp Ser 370 375 380 Ala Lys Leu Thr Leu Gly Gln Pro Leu Thr Val Ile Thr Pro His Thr 385 390 395 400 Leu Glu Ala Ile Val Arg Gln Pro Pro Asp Arg Trp Ile Thr Asn Ala 405 410 415 Arg Leu Thr His Tyr Gln Ala Leu Leu Leu Asp Thr Asp Arg Val Gln 420 425 430 Phe Gly Pro Pro Val Thr Leu Asn Pro Ala Thr Leu Leu Glu Val Pro 435 440 445 Glu Asn Gln Pro Ser Pro His Asp Cys Arg Gln Val Leu Ala Glu Thr 450 455 460 His Gly Thr Arg Glu Asp Leu Lys 465 470 27 471 PRT Friend murine leukemia virus 27 Gly Leu Ala Val Arg Gln Ala Pro Leu Ile Ile Pro Leu Lys Ala Thr 1 5 10 15 Ser Thr Pro 
Val Ser Ile Lys Gln Tyr Pro Met Ser Gln Glu Ala Arg 20 25 30 Leu Gly Ile Lys Pro His Ile Gln Arg Leu Leu Asp Gln Gly Ile Leu 35 40 45 Val Pro Cys Gln Ser Pro Trp Asn Thr Pro Leu Leu Pro Val Lys Lys 50 55 60 Pro Gly Thr Asn Asp Tyr Arg Pro Val Gln Asp Leu Arg Glu Val Asn 65 70 75 80 Lys Arg Val Glu Asp Ile His Pro Thr Val Pro Asn Pro Tyr Asn Leu 85 90 95 Leu Ser Gly Leu Pro Pro Ser His Gln Trp Tyr Thr Val Leu Asp Leu 100 105 110 Lys Asp Ala Phe Phe Cys Leu Arg Leu His Pro Thr Ser Gln Ser Leu 115 120 125 Phe Ala Phe Glu Trp Arg Asp Pro Glu Met Gly Ile Ser Gly Gln Leu 130 135 140 Thr Trp Thr Arg Leu Pro Gln Gly Phe Lys Asn Ser Pro Thr Leu Phe 145 150 155 160 Asp Glu Ala Leu His Arg Asp Leu Ala Asp Phe Arg Ile Gln His Pro 165 170 175 Asp Leu Ile Leu Leu Gln Tyr Val Asp Asp Leu Leu Leu Ala Ala Thr 180 185 190 Ser Glu Leu Asp Cys Gln Gln Gly Thr Arg Ala Leu Leu Gln Thr Leu 195 200 205 Gly Asp Leu Gly Tyr Arg Ala Ser Ala Lys Lys Ala Gln Ile Cys Gln 210 215 220 Lys Gln Val Lys Tyr Leu Gly Tyr Leu Leu Lys Glu Gly Gln Arg Trp 225 230 235 240 Leu Thr Glu Ala Arg Lys Glu Thr Val Met Gly Gln Pro Thr Pro Lys 245 250 255 Thr Pro Arg Gln Leu Arg Glu Phe Leu Gly Thr Ala Gly Phe Cys Arg 260 265 270 Leu Trp Ile Pro Gly Phe Ala Glu Met Ala Ala Pro Leu Tyr Pro Leu 275 280 285 Thr Lys Thr Gly Thr Leu Phe Glu Trp Gly Pro Asp Gln Gln Lys Ala 290 295 300 Tyr Gln Glu Ile Lys Gln Ala Leu Leu Thr Ala Pro Ala Leu Gly Leu 305 310 315 320 Pro Asp Leu Thr Lys Pro Phe Glu Leu Phe Val Asp Glu Lys Gln Gly 325 330 335 Tyr Ala Lys Gly Val Leu Thr Gln Lys Leu Gly Pro Trp Arg Arg Pro 340 345 350 Val Ala Tyr Leu Ser Lys Lys Leu Asp Pro Val Ala Ala Gly Trp Pro 355 360 365 Pro Cys Leu Arg Met Val Ala Ala Ile Ala Val Leu Thr Lys Asp Ala 370 375 380 Gly Lys Leu Thr Met Gly Gln Pro Leu Val Ile Leu Ala Pro His Ala 385 390 395 400 Val Glu Ala Leu Val Lys Gln Pro Pro Asp Arg Trp Leu Ser Asn Ala 405 410 415 Arg Met Thr His Tyr Gln Ala Leu Leu Leu Asp Thr Asp Arg Val Gln 420 425 430 Phe Gly Pro Ile Val Ala Leu Asn 
Pro Ala Thr Leu Leu Pro Leu Pro 435 440 445 Glu Glu Gly Leu Gln His Asp Cys Leu Glu Ile Leu Ala Glu Ala His 450 455 460 Gly Thr Arg Pro Asp Leu Thr 465 470 28 470 PRT Moloney murine leukemia virus 28 Gly Leu Ala Val Arg Gln Ala Pro Leu Ile Ile Pro Leu Lys Ala Thr 1 5 10 15 Ser Thr Pro Val Ser Ile Lys Gln Tyr Pro Met Ser Gln Glu Ala Arg 20 25 30 Leu Gly Ile Lys Pro His Ile Gln Arg Leu Leu Asp Gln Gly Ile Leu 35 40 45 Val Pro Cys Gln Ser Pro Trp Asn Thr Pro Leu Leu Pro Val Lys Lys 50 55 60 Pro Gly Thr Asn Asp Tyr Arg Pro Val Gln Asp Leu Arg Glu Val Asn 65 70 75 80 Lys Arg Val Glu Asp Ile His Pro Thr Val Pro Asn Pro Tyr Asn Leu 85 90 95 Leu Ser Gly Leu Pro Pro Ser His Gln Trp Tyr Thr Val Leu Asp Leu 100 105 110 Lys Asp Ala Phe Phe Cys Leu Arg Leu His Pro Thr Ser Gln Pro Leu 115 120 125 Phe Ala Phe Glu Trp Arg Asp Pro Glu Met Gly Ile Ser Gly Gln Leu 130 135 140 Thr Trp Thr Arg Leu Pro Gln Gly Phe Lys Asn Ser Pro Thr Leu Phe 145 150 155 160 Asp Glu Ala Leu His Arg Asp Leu Ala Asp Phe Arg Ile Gln His Pro 165 170 175 Asp Leu Ile Leu Leu Gln Tyr Val Asp Asp Leu Leu Leu Ala Ala Thr 180 185 190 Ser Glu Leu Asp Cys Gln Gln Gly Thr Arg Ala Leu Leu Gln Thr Leu 195 200 205 Gly Asn Leu Gly Tyr Arg Ala Ser Ala Lys Lys Ala Gln Ile Cys Gln 210 215 220 Lys Gln Val Lys Tyr Leu Gly Tyr Leu Leu Lys Glu Gly Gln Arg Trp 225 230 235 240 Leu Thr Glu Ala Arg Lys Glu Thr Val Met Gly Gln Pro Thr Pro Lys 245 250 255 Thr Pro Arg Gln Leu Arg Glu Phe Leu Gly Thr Ala Gly Phe Cys Arg 260 265 270 Leu Trp Ile Pro Gly Phe Ala Glu Met Ala Ala Pro Leu Tyr Pro Leu 275 280 285 Thr Lys Thr Gly Thr Leu Phe Asn Trp Gly Pro Asp Gln Gln Lys Ala 290 295 300 Tyr Gln Glu Ile Lys Gln Ala Leu Leu Thr Ala Pro Ala Leu Gly Leu 305 310 315 320 Pro Asp Leu Thr Lys Pro Phe Glu Leu Phe Val Asp Glu Lys Gln Gly 325 330 335 Tyr Ala Lys Gly Val Leu Thr Gln Lys Leu Gly Pro Trp Arg Arg Pro 340 345 350 Val Ala Tyr Leu Ser Lys Lys Leu Asp Pro Val Ala Ala Gly Trp Pro 355 360 365 Pro Cys Leu Arg Met Val Ala Ala Ile Ala Leu Thr Lys 
Asp Ala Gly 370 375 380 Lys Leu Thr Met Gly Gln Pro Leu Val Ile Ile Ala Pro His Ala Val 385 390 395 400 Glu Ala Ile Val Lys Gln Pro Pro Asp Arg Trp Leu Ser Asn Ala Arg 405 410 415 Met Thr His Tyr Gln Ala Leu Leu Leu Asp Thr Asp Arg Val Gln Phe 420 425 430 Gly Pro Val Val Ala Leu Asn Pro Ala Thr Leu Leu Pro Leu Pro Glu 435 440 445 Glu Gly Leu Gln His Asn Cys Leu Asp Ile Leu Ala Glu Ala His Gly 450 455 460 Thr Arg Pro Asp Leu Thr 465 470 29 471 PRT AKV murine leukemia virus 29 Gly Leu Ala Val Arg Gln Ala Pro Leu Ile Ile Pro Leu Lys Ala Thr 1 5 10 15 Ser Thr Pro Val Ser Ile Lys Gln Tyr Pro Met Ser Gln Glu Ala Lys 20 25 30 Leu Gly Ile Lys Pro His Ile Gln Arg Leu Leu Asp Gln Gly Ile Leu 35 40 45 Val Pro Cys Gln Ser Pro Trp Asn Thr Pro Leu Leu Pro Val Lys Lys 50 55 60 Pro Gly Thr Asn Asp Tyr Arg Pro Val Gln Asp Leu Arg Glu Val Asn 65 70 75 80 Lys Arg Val Glu Asp Ile His Pro Thr Val Pro Asn Pro Tyr Asn Leu 85 90 95 Leu Ser Gly Leu Pro Pro Ser His Arg Trp Tyr Thr Val Leu Asp Leu 100 105 110 Lys Asp Ala Phe Phe Cys Leu Arg Leu His Pro Thr Ser Gln Pro Leu 115 120 125 Phe Ala Phe Glu Trp Arg Asp Pro Gly Met Gly Ile Ser Gly Gln Leu 130 135 140 Thr Trp Thr Arg Leu Pro Gln Gly Phe Lys Asn Ser Pro Thr Leu Phe 145 150 155 160 Asp Glu Ala Leu His Arg Asp Leu Ala Asp Phe Arg Ile Gln His Pro 165 170 175 Asp Leu Ile Leu Leu Gln Tyr Val Asp Asp Ile Leu Ser Ala Ala Thr 180 185 190 Ser Glu Leu Asp Cys Gln Gln Gly Thr Arg Ala Leu Leu Leu Thr Leu 195 200 205 Gly Asn Leu Gly Tyr Arg Ala Ser Ala Lys Lys Ala Gln Leu Cys Gln 210 215 220 Lys Gln Val Lys Tyr Leu Gly Tyr Leu Leu Lys Glu Gly Gln Arg Trp 225 230 235 240 Leu Thr Glu Ala Arg Lys Glu Thr Val Met Gly Gln Pro Thr Pro Lys 245 250 255 Thr Pro Arg Gln Leu Arg Glu Phe Leu Gly Thr Ala Gly Phe Cys Arg 260 265 270 Leu Trp Ile Pro Gly Phe Ala Glu Met Ala Ala Pro Leu Tyr Pro Leu 275 280 285 Thr Lys Thr Gly Thr Leu Phe Asn Trp Gly Pro Asp Gln Gln Lys Ala 290 295 300 Tyr Gln Glu Ile Lys Gln Ala Leu Leu Thr Ala Pro Ala Leu Gly Leu 305 310 315 320 
Pro Asp Leu Thr Lys Pro Phe Glu Leu Phe Val Asp Glu Lys Gln Gly 325 330 335 Tyr Ala Lys Gly Val Leu Thr Gln Lys Leu Gly Pro Trp Arg Arg Pro 340 345 350 Val Ala Tyr Leu Ser Lys Lys Leu Asp Pro Val Ala Ala Gly Trp Pro 355 360 365 Pro Cys Leu Arg Met Val Ala Ala Ile Ala Val Leu Thr Lys Asp Ala 370 375 380 Gly Lys Leu Thr Met Gly Gln Pro Leu Val Ile Leu Ala Pro His Ala 385 390 395 400 Val Glu Ala Leu Val Lys Gln Pro Pro Asp Arg Trp Leu Ser Asn Ala 405 410 415 Arg Met Thr His Tyr Gln Ala Met Leu Leu Asp Thr Asp Arg Val Gln 420 425 430 Phe Gly Pro Val Val Ala Leu Asn Pro Ala Thr Leu Leu Pro Leu Pro 435 440 445 Glu Glu Gly Ala Pro His Asp Cys Leu Glu Ile Leu Ala Glu Thr His 450 455 460 Gly Thr Arg Pro Asp Leu Thr 465 470 30 471 PRT Radiation murine leukemia virus 30 Gly Leu Ala Val Arg Gln Ala Pro Leu Ile Ile Pro Leu Lys Ala Thr 1 5 10 15 Ser Thr Pro Val Ser Ile Lys Gln Tyr Pro Met Ser Gln Glu Ala Lys 20 25 30 Leu Gly Ile Lys Pro His Ile Gln Arg Leu Leu Asp Gln Gly Ile Leu 35 40 45 Val Pro Cys Gln Ser Pro Trp Asn Thr Pro Leu Leu Pro Val Lys Lys 50 55 60 Pro Gly Thr Asn Asp Tyr Arg Pro Val Gln Gly Leu Arg Glu Val Asn 65 70 75 80 Lys Arg Val Glu Asp Ile His Pro Thr Val Pro Asn Pro Tyr Asn Leu 85 90 95 Leu Ser Gly Leu Pro Thr Ser His Arg Trp Tyr Thr Val Leu Asp Leu 100 105 110 Lys Asp Ala Phe Phe Cys Leu Arg Leu His Pro Thr Ser Gln Pro Leu 115 120 125 Phe Ala Ser Glu Trp Arg Asp Pro Gly Met Gly Ile Ser Gly Gln Leu 130 135 140 Thr Trp Thr Arg Leu Pro Gln Gly Phe Lys Asn Ser Pro Thr Leu Phe 145 150 155 160 Asp Glu Ala Leu His Arg Gly Leu Ala Asp Phe Arg Ile Gln His Pro 165 170 175 Asp Leu Ile Leu Leu Gln Tyr Val Asp Asp Leu Leu Leu Ala Ala Thr 180 185 190 Ser Glu Leu Asp Cys Gln Gln Gly Thr Arg Ala Leu Leu Lys Thr Leu 195 200 205 Gly Asn Leu Gly Tyr Arg Ala Ser Ala Lys Lys Ala Gln Ile Cys Gln 210 215 220 Lys Gln Val Lys Tyr Leu Gly Tyr Leu Leu Arg Glu Gly Gln Arg Trp 225 230 235 240 Leu Thr Glu Ala Arg Lys Glu Thr Val Met Gly Gln Pro Thr Pro Lys 245 250 255 Thr Pro Arg Gln Leu 
Arg Glu Phe Leu Gly Thr Ala Gly Phe Cys Arg 260 265 270 Leu Trp Ile Pro Arg Phe Ala Glu Met Ala Ala Pro Leu Tyr Pro Leu 275 280 285 Thr Lys Thr Gly Thr Leu Phe Asn Trp Gly Pro Asp Gln Gln Lys Ala 290 295 300 Tyr His Glu Ile Lys Gln Ala Leu Leu Thr Ala Pro Ala Leu Gly Leu 305 310 315 320 Pro Asp Leu Thr Lys Pro Phe Glu Leu Phe Val Asp Glu Lys Gln Gly 325 330 335 Tyr Ala Lys Gly Val Leu Thr Gln Lys Leu Gly Pro Trp Arg Arg Pro 340 345 350 Val Ala Tyr Leu Ser Lys Lys Leu Asp Pro Val Ala Ala Gly Trp Pro 355 360 365 Pro Cys Leu Arg Met Val Ala Ala Ile Ala Val Leu Thr Lys Asp Ala 370 375 380 Gly Lys Leu Thr Met Gly Gln Pro Leu Val Ile Leu Ala Pro His Ala 385 390 395 400 Val Glu Ala Leu Val Lys Gln Pro Pro Asp Arg Trp Leu Ser Asn Ala 405 410 415 Arg Met Thr His Tyr Gln Ala Met Leu Leu Asp Thr Asp Arg Val Gln 420 425 430 Phe Gly Pro Val Val Ala Leu Asn Pro Ala Thr Leu Leu Pro Leu Pro 435 440 445 Glu Glu Gly Ala Pro His Asp Cys Leu Glu Ile Leu Ala Glu Glu Thr 450 455 460 Gly Thr Arg Arg Asp Leu Glu 465 470 31 471 PRT Gibbon endogenous virus 31 Gly Leu Ala Asn Gln Val Pro Pro Val Val Val Glu Leu Arg Ser Gly 1 5 10 15 Ala Ser Pro Val Ala Val Arg Gln Tyr Pro Met Ser Lys Glu Ala Arg 20 25 30 Glu Gly Ile Arg Pro His Ile Gln Lys Phe Leu Asp Leu Gly Val Leu 35 40 45 Val Pro Cys Arg Ser Pro Trp Asn Thr Pro Leu Leu Pro Val Lys Lys 50 55 60 Pro Gly Thr Asn Asp Tyr Arg Pro Val Gln Asp Leu Arg Glu Ile Asn 65 70 75 80 Lys Arg Val Gln Asp Ile His Pro Thr Val Pro Asn Pro Tyr Asn Leu 85 90 95 Leu Ser Ser Leu Pro Pro Ser Tyr Thr Trp Tyr Ser Val Leu Asp Leu 100 105 110 Lys Asp Ala Phe Phe Cys Leu Arg Leu His Pro Asn Ser Gln Pro Leu 115 120 125 Phe Ala Phe Glu Trp Lys Asp Pro Glu Lys Gly Asn Thr Gly Gln Leu 130 135 140 Thr Trp Thr Arg Leu Pro Gln Gly Phe Lys Asn Ser Pro Thr Leu Phe 145 150 155 160 Asp Glu Ala Leu His Arg Asp Leu Ala Pro Phe Arg Ala Leu Asn Pro 165 170 175 Gln Val Val Leu Leu Gln Tyr Val Asp Asp Leu Leu Val Ala Ala Pro 180 185 190 Thr Tyr Glu Asp Cys Lys Lys Gly Thr Gln Lys Leu 
Leu Gln Glu Leu 195 200 205 Ser Lys Leu Gly Tyr Arg Val Ser Ala Lys Lys Ala Gln Leu Cys Gln 210 215 220 Arg Glu Val Thr Tyr Leu Gly Tyr Leu Leu Lys Glu Gly Lys Arg Trp 225 230 235 240 Leu Thr Pro Ala Arg Lys Ala Thr Val Met Lys Ile Pro Val Pro Thr 245 250 255 Thr Pro Arg Gln Val Arg Glu Phe Leu Gly Thr Ala Gly Phe Cys Arg 260 265 270 Leu Trp Ile Pro Gly Phe Ala Ser Leu Ala Ala Pro Leu Tyr Pro Leu 275 280 285 Thr Lys Glu Ser Ile Pro Phe Ile Trp Thr Glu Glu His Gln Gln Ala 290 295 300 Phe Asp His Ile Lys Ile Ala Leu Leu Ser Ala Pro Ala Leu Ala Leu 305 310 315 320 Pro Asp Leu Thr Lys Pro Phe Thr Leu Tyr Ile Asp Glu Arg Ala Gly 325 330 335 Val Ala Arg Gly Val Leu Thr Gln Thr Leu Gly Pro Trp Arg Arg Pro 340 345 350 Val Ala Tyr Leu Ser Lys Lys Leu Asp Pro Val Ala Ser Gly Trp Pro 355 360 365 Thr Cys Leu Lys Ala Val Ala Ala Val Ala Leu Leu Leu Lys Asp Ala 370 375 380 Asp Lys Leu Thr Leu Gly Gln Asn Val Thr Val Ile Ala Ser His Ser 385 390 395 400 Leu Glu Ser Ile Val Arg Gln Pro Pro Asp Arg Trp Met Thr Asn Ala 405 410 415 Arg Met Thr His Tyr Gln Ser Leu Leu Leu Asn Glu Arg Val Ser Phe 420 425 430 Ala Pro Pro Ala Val Leu Asn Pro Ala Thr Leu Leu Pro Val Glu Ser 435 440 445 Glu Ala Thr Pro Val His Arg Cys Ser Glu Ile Leu Ala Glu Glu Thr 450 455 460 Gly Thr Arg Arg Asp Leu Glu 465 470 32 440 PRT Simian foamy virus type 1 32 Val Gly His Arg Arg Ile Lys Pro His Asn Ile Ala Thr Gly Thr Leu 1 5 10 15 Ala Pro Arg Pro Gln Lys Gln Tyr Pro Ile Asn Pro Lys Ala Lys Pro 20 25 30 Ser Ile Gln Ile Val Ile Asp Asp Leu Leu Lys Gln Gly Val Leu Ile 35 40 45 Gln Gln Asn Ser Thr Met Asn Thr Pro Val Tyr Pro Val Pro Lys Pro 50 55 60 Asp Gly Lys Trp Arg Met Val Leu Asp Tyr Arg Glu Val Asn Lys Thr 65 70 75 80 Ile Pro Leu Ile Ala Ala Gln Asn Gln His Ser Ala Gly Ile Leu Ser 85 90 95 Ser Ile Tyr Arg Gly Lys Tyr Lys Thr Thr Leu Asp Leu Thr Asn Gly 100 105 110 Phe Trp Ala His Pro Ile Thr Pro Glu Ser Tyr Trp Thr Thr Ala Phe 115 120 125 Thr Trp Gln Gly Lys Gln Tyr Cys Trp Thr Arg Leu Pro Gln Gly Phe 130 135 
140 Leu Asn Ser Pro Ala Leu Phe Thr Ala Asp Val Val Asp Leu Leu Lys 145 150 155 160 Glu Ile Pro Asn Val Gln Ala Tyr Val Asp Asp Ile Tyr Ile Ser His 165 170 175 Asp Asp Pro Gln Glu His Leu Glu Gln Leu Glu Lys Ile Phe Ser Ile 180 185 190 Leu Leu Asn Ala Gly Tyr Val Val Ser Leu Lys Lys Ser Glu Ile Ala 195 200 205 Gln Arg Glu Val Glu Phe Leu Gly Phe Asn Ile Thr Lys Glu Gly Arg 210 215 220 Gly Leu Thr Asp Thr Phe Lys Gln Lys Leu Leu Asn Ile Thr Pro Pro 225 230 235 240 Lys Asp Leu Lys Gln Leu Gln Ser Ile Leu Gly Leu Leu Asn Phe Ala 245 250 255 Arg Asn Phe Ile Pro Asn Tyr Ser Glu Leu Val Lys Pro Leu Tyr Thr 260 265 270 Ile Val Ala Asn Ala Asn Gly Lys Phe Ile Ser Trp Thr Glu Asp Asn 275 280 285 Ser Asn Gln Leu Gln His Ile Ile Ser Val Leu Asn Gln Ala Asp Asn 290 295 300 Leu Glu Glu Arg Asn Pro Glu Thr Arg Leu Ile Ile Phe Val Asn Ser 305 310 315 320 Ser Pro Ser Ala Gly Tyr Ile Arg Tyr Tyr Asn Glu Gly Ser Lys Arg 325 330 335 Pro Ile Met Tyr Val Asn Tyr Ile Phe Ser Lys Ala Glu Ala Lys Phe 340 345 350 Thr Gln Thr Glu Lys Leu Leu Thr Thr Met His Lys Gly Leu Ile Lys 355 360 365 Ala Met Asp Leu Ala Met Gly Gln Glu Ile Leu Val Tyr Ser Pro Ile 370 375 380 Val Ser Met Thr Lys Ile Gln Arg Thr Pro Leu Pro Glu Arg Lys Ala 385 390 395 400 Leu Pro Val Arg Trp Ile Thr Trp Met Thr Tyr Leu Glu Asp Pro Arg 405 410 415 Ile Gln Phe His Tyr Asp Lys Ser Leu Pro Glu Leu Gln Gln Ile Pro 420 425 430 Asn Val Thr Glu Asp Val Ile Ala 435 440 33 440 PRT Simian foamy virus type 3 33 Val Gly His Arg Arg Ile Lys Pro His His Ile Ala Thr Gly Thr Val 1 5 10 15 Asn Pro Arg Pro Gln Lys Gln Tyr Pro Ile Asn Pro Lys Ala Lys Ala 20 25 30 Ser Ile Gln Thr Val Ile Asn Asp Leu Leu Lys Gln Gly Val Leu Ile 35 40 45 Gln Gln Asn Ser Ile Met Asn Thr Pro Val Tyr Pro Val Pro Lys Pro 50 55 60 Asp Gly Lys Trp Arg Met Val Leu Asp Tyr Arg Glu Val Asn Lys Thr 65 70 75 80 Ile Pro Leu Ile Ala Ala Gln Asn Gln His Ser Ala Gly Ile Leu Ser 85 90 95 Ser Ile Phe Arg Gly Lys Tyr Lys Thr Thr Leu Asp Leu Ser Asn Gly 100 105 110 Phe Trp Ala 
His Ser Ile Thr Pro Glu Ser Tyr Trp Leu Thr Ala Phe 115 120 125 Thr Trp Leu Gly Gln Gln Tyr Cys Trp Thr Arg Leu Pro Gln Gly Phe 130 135 140 Leu Asn Ser Pro Ala Leu Phe Thr Ala Asp Val Val Asp Leu Leu Lys 145 150 155 160 Glu Val Pro Asn Val Gln Val Tyr Val Asp Asp Ile Tyr Ile Ser His 165 170 175 Asp Asp Pro Arg Glu His Leu Glu Gln Leu Glu Lys Val Phe Ser Leu 180 185 190 Leu Leu Asn Ala Gly Tyr Val Val Ser Leu Lys Lys Ser Glu Ile Ala 195 200 205 Gln His Glu Val Glu Phe Leu Gly Phe Asn Ile Thr Lys Glu Gly Arg 210 215 220 Gly Leu Thr Glu Thr Phe Lys Gln Lys Leu Leu Asn Ile Thr Pro Pro 225 230 235 240 Arg Asp Leu Lys Gln Leu Gln Ser Ile Leu Gly Leu Leu Asn Phe Ala 245 250 255 Arg Asn Phe Ile Pro Asn Phe Ser Glu Leu Val Lys Pro Leu Tyr Asn 260 265 270 Ile Ile Ala Thr Ala Asn Gly Lys Tyr Ile Thr Trp Thr Thr Asp Asn 275 280 285 Ser Gln Gln Leu Gln Asn Ile Ile Ser Met Leu Asn Ser Ala Glu Asn 290 295 300 Leu Glu Glu Arg Asn Pro Glu Val Arg Leu Ile Met Lys Val Asn Thr 305 310 315 320 Ser Pro Ser Ala Gly Tyr Ile Arg Phe Tyr Asn Glu Phe Ala Lys Arg 325 330 335 Pro Ile Met Tyr Leu Asn Tyr Val Tyr Thr Lys Ala Glu Val Lys Phe 340 345 350 Thr Asn Thr Glu Lys Leu Leu Thr Thr Ile His Lys Gly Leu Ile Lys 355 360 365 Ala Leu Asp Leu Gly Met Gly Gln Glu Ile Leu Val Tyr Ser Pro Ile 370 375 380 Val Ser Met Thr Lys Ile Gln Lys Thr Pro Leu Pro Glu Arg Lys Ala 385 390 395 400 Leu Pro Ile Arg Trp Ile Thr Trp Met Ser Tyr Leu Glu Asp Pro Arg 405 410 415 Ile Gln Phe His Tyr Asp Lys Thr Leu Pro Glu Leu Gln Gln Val Pro 420 425 430 Thr Val Thr Asp Asp Ile Ile Ala 435 440 34 81 PRT Saccharomyces cerevisiae (baker′s yeast) 1BJT 34 Lys Pro Gly Gln Arg Lys Val Leu Tyr Gly Cys Phe Lys Lys Asn Leu 1 5 10 15 Lys Ser Glu Leu Lys Val Ala Gln Leu Ala Pro Tyr Val Ser Glu Cys 20 25 30 Thr Ala Tyr His His Gly Glu Gln Ser Leu Ala Gln Thr Ile Ile Gly 35 40 45 Leu Ala Gln Asn Phe Val Gly Ser Asn Asn Ile Tyr Leu Leu Leu Pro 50 55 60 Asn Gly Ala Phe Gly Thr Arg Ala Thr Gly Gly Lys Asp Ala Ala Ala 65 70 75 80 Ala 36 
81 PRT Mus musculus (house mouse) JS0703 36 Lys Pro Gly Gln Arg Lys Val Leu Phe Thr Cys Phe Lys Arg Asn Asp 1 5 10 15 Lys Arg Glu Val Lys Val Ala Gln Leu Ala Gly Ser Val Ala Glu Met 20 25 30 Ser Ser Tyr His His Gly Glu Met Ser Leu Met Met Thr Ile Ile Asn 35 40 45 Leu Ala Gln Asn Phe Val Gly Ser Asn Asn Leu Asn Leu Leu Gln Pro 50 55 60 Ile Gly Gln Phe Gly Thr Arg Leu His Gly Gly Lys Asp Ser Ala Ser 65 70 75 80 Pro 37 81 PRT Cricetulus griseus (Chinese hamster) A44406 37 Lys Pro Gly Gln Arg Lys Val Leu Phe Thr Cys Phe Lys Arg Asn Asp 1 5 10 15 Lys Arg Glu Val Lys Val Ala Gln Leu Ala Gly Ser Val Ala Glu Met 20 25 30 Ser Ser Tyr His His Gly Glu Met Ser Leu Met Met Thr Ile Ile Asn 35 40 45 Leu Ala Gln Asn Phe Val Gly Ser Asn Asn Leu Asn Leu Leu Gln Pro 50 55 60 Ile Gly Gln Phe Gly Thr Arg Leu His Gly Gly Lys Asp Ser Ala Ser 65 70 75 80 Pro 38 81 PRT Homo sapiens A40493 38 Lys Pro Gly Gln Arg Lys Val Leu Phe Thr Cys Phe Lys Arg Asn Asp 1 5 10 15 Lys Arg Glu Val Lys Val Ala Gln Leu Ala Gly Ser Val Ala Glu Met 20 25 30 Ser Ser Tyr His His Gly Glu Met Ser Leu Met Met Thr Ile Ile Asn 35 40 45 Leu Ala Gln Asn Phe Val Gly Ser Asn Asn Leu Asn Leu Leu Gln Pro 50 55 60 Ile Gly Gln Phe Gly Thr Arg Leu His Gly Gly Lys Asp Ser Ala Ser 65 70 75 80 Pro 39 81 PRT Homo sapiens A39242 39 Lys Pro Gly Gln Arg Lys Val Leu Phe Thr Cys Phe Lys Arg Asn Asp 1 5 10 15 Lys Arg Glu Val Lys Val Ala Gln Leu Ala Gly Ser Val Ala Glu Met 20 25 30 Ser Ala Tyr His His Gly Glu Gln Ala Leu Met Met Thr Ile Val Asn 35 40 45 Leu Ala Gln Asn Phe Val Gly Ser Asn Asn Ile Asn Leu Leu Gln Pro 50 55 60 Ile Gly Gln Phe Gly Thr Arg Leu His Gly Gly Lys Asp Ala Ala Ser 65 70 75 80 Pro 40 81 PRT Cricetulus griseus (Chinese hamster) S59969 40 Lys Pro Gly Gln Arg Lys Val Leu Phe Thr Cys Phe Lys Arg Asn Asp 1 5 10 15 Lys Arg Glu Val Lys Val Ala Gln Leu Ala Gly Ser Val Ala Glu Met 20 25 30 Ser Ala Tyr His His Gly Glu Gln Ala Leu Met Met Thr Ile Val Asn 35 40 45 Leu Ala Gln Asn Phe Val Gly Ser Asn Asn Ile Asn Leu Leu Gln Pro 
50 55 60 Ile Gly Gln Phe Gly Thr Arg Leu His Gly Gly Lys Asp Ala Ala Ser 65 70 75 80 Pro 41 81 PRT Drosophila melanogaster (fruit fly) S02160 41 Lys Pro Gly Gln Arg Lys Val Met Phe Thr Cys Phe Lys Arg Asn Asp 1 5 10 15 Lys Arg Glu Val Lys Val Ala Gln Leu Ser Gly Ser Val Ala Glu Met 20 25 30 Ser Ala Tyr His His Gly Glu Gln Ser Leu Gln Met Thr Ile Val Asn 35 40 45 Leu Ala Gln Asn Phe Val Gly Ala Asn Asn Ile Asn Leu Leu Glu Pro 50 55 60 Arg Gly Gln Phe Gly Thr Arg Leu Ser Gly Gly Lys Asp Cys Ala Ser 65 70 75 80 Ala 42 83 PRT Schizosaccharomyces pombe (fission yeast)ISZPT2 42 Lys Pro Gly Gln Arg Lys Val Val Thr Thr Cys Phe Lys Arg Asn Leu 1 5 10 15 Glu Trp Val His Glu Thr Lys Val Ser Arg Leu Ala Gly Thr Val Ala 20 25 30 Ser Glu Thr Ala Tyr His His Gly Glu Gln Ser Met Glu Gln Thr Ile 35 40 45 Val Asn Leu Ala Gln Asn Phe Val Gly Ser Asn Asn Ile Asn Leu Leu 50 55 60 Met Pro Asp Gly Gln Phe Gly Thr Arg Ser Glu Gly Gly Lys Asn Ala 65 70 75 80 Ser Ala Ser 43 81 PRT Arabidopsis thaliana (thale cress)S53599 43 Lys Pro Gly Gln Arg Lys Ile Leu Phe Val Ala Phe Lys Lys Ile Ala 1 5 10 15 Arg Lys Glu Met Lys Val Ala Gln Leu Val Gly Tyr Val Ser Leu Leu 20 25 30 Ser Ala Tyr His His Gly Glu Gln Ser Leu Ala Ser Ala Ile Ile Gly 35 40 45 Met Ala Gln Asp Tyr Val Gly Ser Asn Asn Ile Asn Leu Leu Leu Pro 50 55 60 Asn Gly Gln Phe Gly Thr Arg Thr Ser Gly Gly Lys Asp Ser Ala Ser 65 70 75 80 Ala 44 75 PRT Pisum sativum (pea) T06819 44 Asp Leu Phe Gly Ser Phe Lys Lys Lys Leu Tyr Lys Glu Ile Lys Val 1 5 10 15 Ala Gln Phe Ile Gly Tyr Val Ser Glu His Ser Ala Tyr His His Gly 20 25 30 Glu Gln Ser Leu Ala Ser Thr Ile Ile Gly Met Ala Gln Asp Phe Val 35 40 45 Gly Ser Asn Asn Ile Asn Leu Leu Lys Pro Asn Gly Gln Phe Gly Thr 50 55 60 Cys Asn Leu Gly Gly Lys Asp His Ala Ser Ala 65 70 75 45 84 PRT African swine fever virus ISXFAS 45 Thr Arg Ala Arg Arg Lys Ile Leu Ala Gly Gly Val Lys Cys Phe Ala 1 5 10 15 Ser Asn Asn Arg Glu Arg Lys Val Phe Gln Phe Gly Gly Tyr Val Ala 20 25 30 Asp His Met Phe Tyr His His Gly 
Asp Met Ser Leu Asn Thr Ser Ile 35 40 45 Ile Lys Ala Ala Gln Tyr Tyr Pro Gly Ser Ser His Leu Tyr Pro Val 50 55 60 Phe Ile Gly Ile Gly Ser Phe Gly Ser Arg His Leu Gly Gly Lys Asp 65 70 75 80 Ala Gly Ser Pro 46 84 PRT African swine fever virus S27329 46 Thr Arg Ala Arg Arg Lys Ile Leu Ala Gly Gly Leu Lys Cys Phe Ala 1 5 10 15 Ser Asn Asn Arg Glu Arg Lys Val Phe Gln Phe Gly Gly Tyr Val Ala 20 25 30 Asp His Met Phe Tyr His His Gly Asp Met Ser Leu Asn Thr Ser Ile 35 40 45 Ile Lys Ala Ala Gln Tyr Tyr Pro Gly Ser Ser His Leu Tyr Pro Val 50 55 60 Phe Ile Gly Ile Gly Ser Phe Gly Ser Arg His Leu Gly Gly Lys Asp 65 70 75 80 Ala Gly Ser Pro 47 81 PRT Saccharomyces cerevisiae (baker′s yeast) 1BJT 47 Lys Pro Gly Gln Arg Lys Val Leu Tyr Gly Cys Phe Lys Lys Asn Leu 1 5 10 15 Lys Ser Glu Leu Lys Val Ala Gln Leu Ala Pro Tyr Val Ser Glu Cys 20 25 30 Thr Ala Tyr His His Gly Glu Gln Ser Leu Ala Gln Thr Ile Ile Gly 35 40 45 Leu Ala Gln Asn Phe Val Gly Ser Asn Asn Ile Tyr Leu Leu Leu Pro 50 55 60 Asn Gly Ala Phe Gly Thr Arg Ala Thr Gly Gly Lys Asp Ala Ala Ala 65 70 75 80 Ala 48 81 PRT Rattus norvegicus (Norway rat) JN0598 48 Lys Pro Gly Gln Arg Lys Val Leu Phe Thr Cys Phe Lys Arg Asn Asp 1 5 10 15 Lys Arg Glu Val Lys Val Ala Gln Leu Ala Gly Ser Val Ala Glu Met 20 25 30 Ser Ser Tyr His His Gly Glu Met Ser Leu Met Met Thr Ile Ile Asn 35 40 45 Leu Ala Gln Asn Phe Val Gly Ser Asn Asn Leu Asn Leu Leu Gln Pro 50 55 60 Ile Gly Gln Phe Gly Thr Arg Leu His Gly Gly Lys Asp Ser Ala Ser 65 70 75 80 Pro 49 81 PRT Mus musculus (house mouse) JS0703 49 Lys Pro Gly Gln Arg Lys Val Leu Phe Thr Cys Phe Lys Arg Asn Asp 1 5 10 15 Lys Arg Glu Val Lys Val Ala Gln Leu Ala Gly Ser Val Ala Glu Met 20 25 30 Ser Ser Tyr His His Gly Glu Met Ser Leu Met Met Thr Ile Ile Asn 35 40 45 Leu Ala Gln Asn Phe Val Gly Ser Asn Asn Leu Asn Leu Leu Gln Pro 50 55 60 Ile Gly Gln Phe Gly Thr Arg Leu His Gly Gly Lys Asp Ser Ala Ser 65 70 75 80 Pro 50 81 PRT Cricetulus griseus (Chinese hamster) A44406 50 Lys Pro Gly Gln Arg Lys Val 
Leu Phe Thr Cys Phe Lys Arg Asn Asp 1 5 10 15 Lys Arg Glu Val Lys Val Ala Gln Leu Ala Gly Ser Val Ala Glu Met 20 25 30 Ser Ser Tyr His His Gly Glu Met Ser Leu Met Met Thr Ile Ile Asn 35 40 45 Leu Ala Gln Asn Phe Val Gly Ser Asn Asn Leu Asn Leu Leu Gln Pro 50 55 60 Ile Gly Gln Phe Gly Thr Arg Leu His Gly Gly Lys Asp Ser Ala Ser 65 70 75 80 Pro 51 81 PRT Homo sapiens A40493 51 Lys Pro Gly Gln Arg Lys Val Leu Phe Thr Cys Phe Lys Arg Asn Asp 1 5 10 15 Lys Arg Glu Val Lys Val Ala Gln Leu Ala Gly Ser Val Ala Glu Met 20 25 30 Ser Ser Tyr His His Gly Glu Met Ser Leu Met Met Thr Ile Ile Asn 35 40 45 Leu Ala Gln Asn Phe Val Gly Ser Asn Asn Leu Asn Leu Leu Gln Pro 50 55 60 Ile Gly Gln Phe Gly Thr Arg Leu His Gly Gly Lys Asp Ser Ala Ser 65 70 75 80 Pro 52 81 PRT Homo sapiens A39242 52 Lys Pro Gly Gln Arg Lys Val Leu Phe Thr Cys Phe Lys Arg Asn Asp 1 5 10 15 Lys Arg Glu Val Lys Val Ala Gln Leu Ala Gly Ser Val Ala Glu Met 20 25 30 Ser Ala Tyr His His Gly Glu Gln Ala Leu Met Met Thr Ile Val Asn 35 40 45 Leu Ala Gln Asn Phe Val Gly Ser Asn Asn Ile Asn Leu Leu Gln Pro 50 55 60 Ile Gly Gln Phe Gly Thr Arg Leu His Gly Gly Lys Asp Ala Ala Ser 65 70 75 80 Pro 53 81 PRT Cricetulus griseus (Chinese hamster) S59969 53 Lys Pro Gly Gln Arg Lys Val Leu Phe Thr Cys Phe Lys Arg Asn Asp 1 5 10 15 Lys Arg Glu Val Lys Val Ala Gln Leu Ala Gly Ser Val Ala Glu Met 20 25 30 Ser Ala Tyr His His Gly Glu Gln Ala Leu Met Met Thr Ile Val Asn 35 40 45 Leu Ala Gln Asn Phe Val Gly Ser Asn Asn Ile Asn Leu Leu Gln Pro 50 55 60 Ile Gly Gln Phe Gly Thr Arg Leu His Gly Gly Lys Asp Ala Ala Ser 65 70 75 80 Pro 54 81 PRT Drosophila melanogaster (fruit fly) S02160 54 Lys Pro Gly Gln Arg Lys Val Met Phe Thr Cys Phe Lys Arg Asn Asp 1 5 10 15 Lys Arg Glu Val Lys Val Ala Gln Leu Ser Gly Ser Val Ala Glu Met 20 25 30 Ser Ala Tyr His His Gly Glu Gln Ser Leu Gln Met Thr Ile Val Asn 35 40 45 Leu Ala Gln Asn Phe Val Gly Ala Asn Asn Ile Asn Leu Leu Glu Pro 50 55 60 Arg Gly Gln Phe Gly Thr Arg Leu Ser Gly Gly Lys Asp Cys Ala Ser 
65 70 75 80 Ala 55 81 PRT Schizosaccharomyces pombe (fission yeast)ISZPT2 55 Lys Pro Gly Gln Arg Lys Val Val Tyr Tyr Cys Phe Lys Arg Asn Leu 1 5 10 15 Val His Glu Thr Lys Val Ser Arg Leu Ala Gly Tyr Val Ala Ser Glu 20 25 30 Thr Ala Tyr His His Gly Glu Gln Ser Met Glu Gln Thr Ile Val Asn 35 40 45 Leu Ala Gln Asn Phe Val Gly Ser Asn Asn Ile Asn Leu Leu Met Pro 50 55 60 Asn Gly Gln Phe Gly Thr Arg Ser Glu Gly Gly Lys Asn Ala Ser Ala 65 70 75 80 Ser 56 81 PRT Arabidopsis thaliana (thale cress)S53599 56 Lys Pro Gly Gln Arg Lys Ile Leu Phe Val Ala Phe Lys Lys Ile Ala 1 5 10 15 Arg Lys Glu Met Lys Val Ala Gln Leu Val Gly Tyr Val Ser Leu Leu 20 25 30 Ser Ala Tyr His His Gly Glu Gln Ser Leu Ala Ser Ala Ile Ile Gly 35 40 45 Met Ala Gln Asp Tyr Val Gly Ser Asn Asn Ile Asn Leu Leu Leu Pro 50 55 60 Asn Gly Gln Phe Gly Thr Arg Thr Ser Gly Gly Lys Asp Ser Ala Ser 65 70 75 80 Ala 57 75 PRT Pisum sativum (pea) T06819 57 Asp Leu Phe Gly Ser Phe Lys Lys Lys Leu Tyr Lys Glu Ile Lys Val 1 5 10 15 Ala Gln Phe Ile Gly Tyr Val Ser Glu His Ser Ala Tyr His His Gly 20 25 30 Glu Gln Ser Leu Ala Ser Thr Ile Ile Gly Met Ala Gln Asp Phe Val 35 40 45 Gly Ser Asn Asn Ile Asn Leu Leu Lys Pro Asn Gly Gln Phe Gly Thr 50 55 60 Cys Asn Leu Gly Gly Lys Asp His Ala Ser Ala 65 70 75 58 84 PRT African swine fever virus ISXFAS 58 Thr Arg Ala Arg Arg Lys Ile Leu Ala Gly Gly Val Lys Cys Phe Ala 1 5 10 15 Ser Asn Asn Arg Glu Arg Lys Val Phe Gln Phe Gly Gly Tyr Val Ala 20 25 30 Asp His Met Phe Tyr His His Gly Asp Met Ser Leu Asn Thr Ser Ile 35 40 45 Ile Lys Ala Ala Gln Tyr Tyr Pro Gly Ser Ser His Leu Tyr Pro Val 50 55 60 Phe Ile Gly Ile Gly Ser Phe Gly Ser Arg His Leu Gly Gly Lys Asp 65 70 75 80 Ala Gly Ser Pro 59 84 PRT African swine fever virus S27329 59 Thr Arg Ala Arg Arg Lys Ile Leu Ala Gly Gly Leu Lys Cys Phe Ala 1 5 10 15 Ser Asn Asn Arg Glu Arg Lys Val Phe Gln Phe Gly Gly Tyr Val Ala 20 25 30 Asp His Met Phe Tyr His His Gly Asp Met Ser Leu Asn Thr Ser Ile 35 40 45 Ile Lys Ala Ala Gln Tyr Tyr Pro Gly Ser 
Ser His Leu Tyr Pro Val 50 55 60 Phe Ile Gly Ile Gly Ser Phe Gly Ser Arg His Leu Gly Gly Lys Asp 65 70 75 80 Ala Gly Ser Pro 

What is claimed is:
 1. A method of designing a molecular structure of an inhibitor on a target enzyme whose secondary and tertiary structures are known, by analyzing, in the tertiary structure of a first enzyme whose primary, secondary and tertiary structures are known, a geometric configuration formed by amino acids to which a first inhibitor, having biological, inhibitory activity on the first enzyme, is to bind; and searching, in the target enzyme, the similar geometric configuration to the geometric configuration that was analyzed in the first enzyme; thereby designing the molecular structure of the inhibitor on the target enzyme, wherein the method comprises: (1) determining, in the tertiary structure of the first enzyme, a first pocket amino acid group consisting of plural amino acids to which the first inhibitor is to bind, wherein the plural amino acids are determined by comparing ¹H-¹⁵N HMQC spectroscopy of the first enzyme in the absence of the first inhibitor with ¹H-¹⁵N HMQC spectroscopy of the first enzyme in the presence of the first inhibitor, and selecting amino acids that satisfy the condition that the absolute value of ¹H chemical shift is equal to or more than a specific value and/or the absolute value of ¹⁵N chemical shift is equal to or more than another specific value; (2) determining the geometric configuration of the first pocket amino acid group, by measuring, in the tertiary structure of the first enzyme, distances between the Cα atoms of each of the plural amino acids constituting the first pocket amino acid group; (3) detecting, in the tertiary structure of the target enzyme, one or more candidates for a target pocket amino acid group having a geometric configuration similar to the geometric configuration formed by the first pocket amino acid group determined in (2), wherein it is presumed that the candidate for the target pocket amino acid group exists on the target enzyme if the following conditions are satisfied: respective amino acids constituting the 
candidate for the target pocket amino acid group are of the same kinds as the plural amino acids constituting the first pocket amino acid group determined in (2), and the absolute values of differences between (a) and (b) are within a specific value, wherein (a) is respective distances, in the tertiary structure of the target enzyme, between the Cα atoms of each of the amino acids constituting the target pocket amino acid group, and (b) is respective distances between the Cα atoms in the first pocket amino acid group obtained in (2); and (4) screening the amino acids constituting the candidate for the target pocket amino acid group by the following two requirements: (requirement 1) the amino acids are those present at the surface of the target enzyme in the tertiary structure thereof; and (requirement 2) the amino acids are those that have been conserved among species, when a comparison is made between the primary sequence of the amino acids of the target enzyme and the primary sequence of the amino acids of another enzyme having the same biochemical activity as the target enzyme but derived from an organism of a different species from that of the target enzyme, in terms of biological classification; and determining the amino acids satisfying the above two requirements to be those constituting the usable target pocket amino acid group for designing the molecular structure of the inhibitor on the target enzyme; thereby the molecular structure of the inhibitor on the target enzyme is designed on the basis of the kinds of amino acid residues of the first enzyme, to which the first inhibitor is to bind, and the geometric configuration of the amino acid residues.
 2. The method according to claim 1, wherein the first enzyme is rat DNA polymerase β, the first inhibitor is nervonic acid and the target enzyme is human immunodeficiency virus reverse transcriptase; in said determining amino acids of (1), the amino acids satisfying the absolute value of ¹H chemical shift of 0.06 ppm or more, or the absolute value of ¹⁵N chemical shift of 0.4 ppm or more, are selected thereby a rat DNA polymerase β pocket amino acid group is determined to be Leu-11, Lys-35, His-51 and Thr-79; in said determining the geometric configuration of (2), the geometric configuration of a triangular pyramid formed by Leu-11, Lys-35, His-51 and Thr-79 is determined; in said detecting the tertiary structure of (3), the values of the differences between the respective distances of Cα atoms of each of Leu, Lys, His and Thr in the tertiary structure of human immunodeficiency virus reverse transcriptase and the respective distances of Cα atoms obtained in (2), are set to be ±2×10⁻¹ nm, thereby determining the candidate for the target pocket amino acid group; in said screening of (4), the usable target pocket amino acid group for designing the molecular structure of the inhibitor is determined to be Lys-65, Leu-100, His-235 and Thr-386 of human immunodeficiency virus reverse transcriptase; and thereby the molecular structure of the inhibitor on human immunodeficiency virus reverse transcriptase is designed on the basis of the kinds of amino acid residues of rat DNA polymerase β to which nervonic acid is to bind, and the geometric configuration of the amino acid residues.
 3. A method of designing a molecular structure of an inhibitor on a target enzyme whose tertiary structure is unknown, by analyzing, in the tertiary structure of a first enzyme whose primary, secondary and tertiary structures are known, a geometric configuration formed by amino acids to which a first inhibitor, having biological, inhibitory activity on the first enzyme, is to bind; and searching, in a second enzyme whose primary, secondary and tertiary structures are known, the similar geometric configuration to the geometric configuration that was analyzed in the first enzyme; thereby designing the molecular structure of the inhibitor on the target enzyme, wherein the method comprises:
(1) determining, in the tertiary structure of the first enzyme, a first pocket amino acid group consisting of plural amino acids to which the first inhibitor is to bind, wherein the plural amino acids are determined by comparing ¹H-¹⁵N HMQC spectroscopy of the first enzyme in the absence of the first inhibitor with ¹H-¹⁵N HMQC spectroscopy of the first enzyme in the presence of the first inhibitor, and selecting amino acids that satisfy the condition that the absolute value of ¹H chemical shift is equal to or more than a specific value and/or the absolute value of ¹⁵N chemical shift is equal to or more than another specific value;
(2) determining the geometric configuration of the first pocket amino acid group, by measuring, in the tertiary structure of the first enzyme, the distances between the Cα atoms of each of the plural amino acids constituting the first pocket amino acid group;
(3) detecting, in the tertiary structure of the second enzyme, a second pocket amino acid group having a geometric configuration similar to the geometric configuration formed by the first pocket amino acid group determined in (2), wherein it is presumed that the second pocket amino acid group has a geometric configuration similar to the geometric configuration formed by the first pocket amino acid group if the following requirements are satisfied:
(requirement 1) the second enzyme has the same biochemical activity as that of the target enzyme;
(requirement 2) the organism from which the second enzyme is derived, belongs to a different species, in terms of biological classification, from the organism from which the target enzyme is derived;
(requirement 3) the primary, secondary and tertiary structures of the second enzyme are known; and
(requirement 4) the second enzyme is biochemically inhibited by the first inhibitor, and respective amino acids constituting the second pocket amino acid group are of the same kinds as the plural amino acids constituting the first pocket amino acid group determined in (2), and the absolute values of differences between (a) and (b) are within a specific value, wherein (a) is respective distances, in the tertiary structure of the second enzyme, between the Cα atoms of each of the amino acids constituting the second pocket amino acid group, and (b) is respective distances between the Cα atoms in the first pocket amino acid group obtained in (2); and
(4) screening the amino acids constituting the second pocket amino acid group by the following two requirements:
(requirement 1) the amino acids are those present at the surface of the second enzyme in the tertiary structure thereof; and
(requirement 2) the amino acids are those that have been conserved among species, when a comparison is made among the primary sequence of the amino acids of the second enzyme, the primary sequence of the amino acids of the target enzyme and the primary sequence of the amino acids of another enzyme having the same biochemical activity as that of the second enzyme but derived from an organism of a different species from that of both the second enzyme and the target enzyme, in terms of biological classification; and
determining the amino acids satisfying the above two requirements to be those constituting the usable target pocket amino acid group for designing the molecular structure of the inhibitor on the target enzyme;
thereby the molecular structure of the inhibitor on the target enzyme is designed on the basis of the kinds of amino acid residues of the first enzyme, to which the first inhibitor is to bind, and the geometric configuration of the amino acid residues.
 4. The method according to claim 3, wherein the first enzyme is rat DNA polymerase β, the first inhibitor is nervonic acid, the second enzyme is yeast DNA topoisomerase II and the target enzyme is human DNA topoisomerase II; in said determining amino acids of (1), the amino acids satisfying the absolute value of ¹H chemical shift of 0.06 ppm or more, or the absolute value of ¹⁵N chemical shift of 0.4 ppm or more, are selected thereby a rat DNA polymerase β pocket amino acid group is determined to be Leu-11, Lys-35, His-51 and Thr-79; in said determining the geometric configuration of (2), the geometric configuration of a triangular pyramid formed by Leu-11, Lys-35, His-51 and Thr-79 is determined; in said detecting the tertiary structure of (3), the values of the differences between the respective distances of Cα atoms of each of Leu, Lys, His and Thr in the tertiary structure of yeast DNA topoisomerase II and the respective distances of Cα atoms obtained in (2), are set to be ±2×10⁻¹ nm, thereby determining the yeast DNA topoisomerase II pocket amino acid group; in said screening of (4), the usable target pocket amino acid group for designing the molecular structure of the inhibitor is determined to be Thr-596, His-735, Leu-741 and Lys-983 of yeast DNA topoisomerase II; and thereby the molecular structure of the inhibitor on human DNA topoisomerase II is designed on the basis of the kinds of amino acid residues of yeast DNA topoisomerase II, to which nervonic acid is to bind, and the geometric configuration of the amino acid residues.
 5. The method according to claim 3, wherein the first enzyme is rat DNA polymerase β, the first inhibitor is lithocholic acid, the second enzyme is yeast DNA topoisomerase II and the target enzyme is human DNA topoisomerase II; in said determining amino acids of (1), the amino acids satisfying the absolute value of ¹H chemical shift of 0.06 ppm or more, or the absolute value of ¹⁵N chemical shift of 0.4 ppm or more, are selected thereby a rat DNA polymerase β pocket amino acid group is determined to be Lys-60, Leu-77 and Thr-79; in said determining the geometric configuration of (2), the geometric configuration of a triangle formed by Lys-60, Leu-77 and Thr-79 is determined; in said detecting the tertiary structure of (3), the values of the differences between the respective distances of Cα atoms of each of Lys, Leu and Thr in the tertiary structure of yeast DNA topoisomerase II and the respective distances of Cα atoms obtained in (2), are set to be ±2×10⁻¹ nm, thereby determining the yeast DNA topoisomerase II pocket amino acid group; in said screening of (4), the usable target pocket amino acid group for designing the molecular structure of the inhibitor is determined to be Lys-720, Leu-760 and Thr-791 of yeast DNA topoisomerase II; and thereby the molecular structure of the inhibitor on human DNA topoisomerase II is designed on the basis of the kinds of amino acid residues of yeast DNA topoisomerase II, to which lithocholic acid is to bind, and the geometric configuration of the amino acid residues.
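
For illustration only, steps (1) to (3) of claim 3 might be sketched in Python as follows, using the thresholds recited in claims 4 and 5 (0.06 ppm for ¹H, 0.4 ppm for ¹⁵N, and a distance tolerance of 2×10⁻¹ nm). This is a minimal sketch and not the implementation of the claimed method. The function names, the data layout, and every chemical shift and coordinate value below are hypothetical and invented for the example; only the residue labels and the thresholds are taken from the claims. The surface and conservation screening of step (4) is not covered.

```python
from itertools import combinations, product
from math import dist


def select_shifted_residues(free, bound, h_min=0.06, n_min=0.4):
    """Step (1): keep residues with |d(1H)| >= h_min or |d(15N)| >= n_min (ppm)."""
    selected = []
    for res, (h_free, n_free) in free.items():
        h_bound, n_bound = bound[res]
        if abs(h_bound - h_free) >= h_min or abs(n_bound - n_free) >= n_min:
            selected.append(res)
    return selected


def ca_distances(coords, residues):
    """Step (2): C-alpha to C-alpha distance (nm) for every residue pair."""
    return {(a, b): dist(coords[a], coords[b]) for a, b in combinations(residues, 2)}


def kind(res):
    """Amino acid kind from a label such as 'Lys-60' (hypothetical label format)."""
    return res.split("-")[0]


def find_similar_groups(ref_residues, ref_coords, cand_coords, tol=0.2):
    """Step (3): groups of the same kinds whose pairwise distances agree within tol."""
    ref_d = ca_distances(ref_coords, ref_residues)
    # For each reference residue, list candidate residues of the same kind.
    pools = [[c for c in cand_coords if kind(c) == kind(r)] for r in ref_residues]
    matches = []
    for combo in product(*pools):
        if len(set(combo)) < len(combo):
            continue  # a candidate residue may be used only once per group
        m = dict(zip(ref_residues, combo))
        if all(abs(dist(cand_coords[m[a]], cand_coords[m[b]]) - d) <= tol
               for (a, b), d in ref_d.items()):
            matches.append(combo)
    return matches


# Invented (1H, 15N) shifts in ppm, without and with the inhibitor.
free = {"Lys-60": (8.10, 120.0), "Gly-64": (8.50, 110.0),
        "Leu-77": (7.90, 118.0), "Thr-79": (8.30, 115.0)}
bound = {"Lys-60": (8.20, 120.1), "Gly-64": (8.51, 110.1),
         "Leu-77": (7.92, 118.6), "Thr-79": (8.40, 115.0)}

# Invented C-alpha coordinates in nm; not taken from any real structure.
first_enzyme = {"Lys-60": (0.0, 0.0, 0.0), "Leu-77": (1.0, 0.0, 0.0),
                "Thr-79": (0.0, 1.0, 0.0)}
second_enzyme = {"Lys-100": (20.0, 20.0, 20.0), "Leu-300": (9.0, 9.0, 9.0),
                 "Lys-720": (5.0, 5.0, 5.0), "Leu-760": (6.1, 5.0, 5.0),
                 "Thr-791": (5.0, 6.05, 5.0)}

pocket = select_shifted_residues(free, bound)
matches = find_similar_groups(pocket, first_enzyme, second_enzyme)
print(pocket)
print(matches)
```

The search in `find_similar_groups` is exhaustive over all same-kind residue combinations, which is adequate for groups of three or four residues as in claims 4 and 5 but would need pruning for larger groups or larger proteins.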